package execution


Type Members

  1. abstract class BaseHashJoinIterator extends SplittableJoinIterator
  2. class BatchPartitionIdPassthrough extends Partitioner

    A dummy partitioner for use with records whose partition ids have been pre-computed (i.e. for use on RDDs of (Int, Row) pairs where the Int is a partition id in the expected range).
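The pass-through behavior can be sketched as a minimal partitioner. This is an illustration only, not the actual implementation: `org.apache.spark.Partitioner` is stubbed locally so the sketch stands alone, and `PassthroughPartitioner` is a hypothetical name.

```scala
// Stub of org.apache.spark.Partitioner so this sketch is self-contained;
// the real abstract class lives in Spark.
abstract class Partitioner extends Serializable {
  def numPartitions: Int
  def getPartition(key: Any): Int
}

// Sketch of a pass-through partitioner for records whose partition ids are
// precomputed: the key of each record IS its partition id, so getPartition
// simply returns it unchanged.
class PassthroughPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key.asInstanceOf[Int]
}
```

Such a partitioner only makes sense on data keyed by an Int already in the range [0, numPartitions - 1], e.g. an RDD of (Int, Row) pairs.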

  3. class CoalescedBatchPartitioner extends Partitioner

    A Partitioner that might group together one or more partitions from the parent.

  4. class ConditionalHashJoinIterator extends BaseHashJoinIterator

    An iterator that does a hash join against a stream of batches with an inequality condition. The compiled condition will be closed when this iterator is closed.

  5. class ConditionalNestedLoopExistenceJoinIterator extends ExistenceJoinIterator
  6. class ConditionalNestedLoopJoinIterator extends SplittableJoinIterator
  7. class CrossJoinIterator extends AbstractGpuJoinIterator

    An iterator that does a cross join against a stream of batches.

  8. abstract class ExistenceJoinIterator extends Iterator[ColumnarBatch] with TaskAutoCloseableResource with Arm

    Existence join generates an exists boolean column and appends it to the output columns. A true value in the exists column indicates that the left table should retain that row; the row count of the exists column equals the row count of the left table.

    e.g.:
      select * from left_table
      where left_table.column_0 >= 3
         or exists (select * from right_table
                    where left_table.column_1 < right_table.column_1)

    This SQL corresponds to the plan:
      Filter(left_table.column_0 >= 3 or `exists`)
        Existence_join (left + `exists`)  // `exists` does not shrink or expand the rows of the left table
          left_table
          right_table

  9. case class GpuBroadcastExchangeExec(mode: BroadcastMode, child: SparkPlan) extends GpuBroadcastExchangeExecBase with Product with Serializable
  10. abstract class GpuBroadcastExchangeExecBase extends Exchange with ShimBroadcastExchangeLike with ShimUnaryExecNode with GpuExec
  11. class GpuBroadcastMeta extends SparkPlanMeta[BroadcastExchangeExec]
  12. case class GpuBroadcastNestedLoopJoinExec(left: SparkPlan, right: SparkPlan, joinType: JoinType, gpuBuildSide: GpuBuildSide, condition: Option[Expression], targetSizeBytes: Long) extends SparkPlan with ShimBinaryExecNode with GpuExec with Product with Serializable
  13. class GpuBroadcastNestedLoopJoinMeta extends GpuBroadcastJoinMeta[BroadcastNestedLoopJoinExec]
  14. class GpuColumnToRowMapPartitionsRDD extends MapPartitionsRDD[InternalRow, ColumnarBatch]
  15. case class GpuCustomShuffleReaderExec(child: SparkPlan, partitionSpecs: Seq[ShufflePartitionSpec]) extends SparkPlan with ShimUnaryExecNode with GpuExec with Product with Serializable

    A wrapper of shuffle query stage, which follows the given partition arrangement.

    child
      It is usually ShuffleQueryStageExec, but can be the shuffle exchange node during canonicalization.

    partitionSpecs
      The partition specs that define the arrangement.

  16. trait GpuHashJoin extends SparkPlan with GpuExec
  17. abstract class GpuShuffleExchangeExecBase extends Exchange with ShimUnaryExecNode with GpuExec

    Performs a shuffle that will result in the desired partitioning.

  18. abstract class GpuShuffleExchangeExecBaseWithMetrics extends GpuShuffleExchangeExecBase

    Performs a shuffle that will result in the desired partitioning.

  19. class GpuShuffleMeta extends SparkPlanMeta[ShuffleExchangeExec]
  20. case class GpuSubqueryBroadcastExec(name: String, index: Int, buildKeys: Seq[Expression], child: SparkPlan)(modeKeys: Option[Seq[Expression]]) extends BaseSubqueryExec with GpuExec with ShimUnaryExecNode with Product with Serializable
  21. class GpuSubqueryBroadcastMeta extends SparkPlanMeta[SubqueryBroadcastExec]
  22. class HashJoinIterator extends BaseHashJoinIterator

    An iterator that does a hash join against a stream of batches.

  23. class HashedExistenceJoinIterator extends ExistenceJoinIterator
  24. class SerializeBatchDeserializeHostBuffer extends Serializable with AutoCloseable with Arm
    Annotations
    @SerialVersionUID()
  25. class SerializeConcatHostBuffersDeserializeBatch extends Serializable with Arm with AutoCloseable
    Annotations
    @SerialVersionUID()
  26. class ShuffledBatchRDD extends RDD[ColumnarBatch]

    This is a specialized version of org.apache.spark.rdd.ShuffledRDD that is optimized for shuffling ColumnarBatch instead of Java key-value pairs.

    This RDD takes a ShuffleDependency (dependency), and an array of ShufflePartitionSpec as input arguments.

    The dependency has the parent RDD of this RDD, which represents the dataset before shuffle (i.e. map output). Elements of this RDD are (partitionId, Row) pairs. Partition ids should be in the range [0, numPartitions - 1]. dependency.partitioner is the original partitioner used to partition map output, and dependency.partitioner.numPartitions is the number of pre-shuffle partitions (i.e. the number of partitions of the map output).

    This code is made to try and match the Spark code as closely as possible to make maintenance simpler. Fixing compiler or IDE warnings in this code may not be ideal if the same warnings are in Spark.
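The pre-shuffle layout described above can be sketched as follows. The names and data are illustrative only, not ShuffledBatchRDD internals; plain Seq and String stand in for the parent RDD and Row types so the example runs anywhere.

```scala
// Sketch of the map output: elements are (partitionId, row) pairs, and every
// partition id falls in [0, numPartitions - 1].
val numPartitions = 3
val mapOutput: Seq[(Int, String)] =
  Seq((0, "row-a"), (2, "row-b"), (1, "row-c"), (0, "row-d"))

// Check the documented invariant on partition ids.
require(mapOutput.forall { case (pid, _) => 0 <= pid && pid < numPartitions })

// Group rows by their precomputed partition id, as the shuffle would.
val byPartition: Map[Int, Seq[String]] =
  mapOutput.groupBy(_._1).map { case (pid, rows) => pid -> rows.map(_._2) }
```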

  27. case class ShuffledBatchRDDPartition(index: Int, spec: ShufflePartitionSpec) extends Partition with Product with Serializable

Value Members

  1. object GpuBroadcastExchangeExecBase extends Serializable
  2. object GpuBroadcastHelper
  3. object GpuBroadcastNestedLoopJoinExec extends Arm with Serializable
  4. object GpuColumnToRowMapPartitionsRDD extends Serializable
  5. object GpuHashJoin extends Arm with Serializable
  6. object GpuShuffleExchangeExecBase extends Serializable
  7. object GpuShuffleMeta
  8. object GpuSubqueryBroadcastExec extends Serializable
  9. object InternalColumnarRddConverter extends Logging

    Please don't use this class directly; use com.nvidia.spark.rapids.ColumnarRdd instead. We had to place the implementation in a Spark-specific package to poke at the internals of Spark more than anyone should know about.

    This provides a way to get the GPU columnar data back out as an RDD[Table]. Each Table will have the same schema as the dataframe passed in. If the schema of the dataframe is something that RAPIDS does not currently support, an IllegalArgumentException will be thrown.

    The size of each table will be determined by what is producing that table but typically will be about the number of bytes set by RapidsConf.GPU_BATCH_SIZE_BYTES.

    Table is not a typical element type for an RDD, so special care needs to be taken when working with it. By default it is not serializable, so repartitioning the RDD or any other operation that involves a shuffle will not work, because it is very expensive to serialize and deserialize a GPU Table with a conventional Spark shuffle. Also, most of the memory associated with a Table is on the GPU itself, so each table must be closed when it is no longer needed to avoid running out of GPU memory. By convention it is the responsibility of the one consuming the data to close it when it is no longer needed.
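The close-when-done convention can be sketched with a stand-in for the GPU Table. ai.rapids.cudf.Table is replaced here by a simple AutoCloseable so the sketch runs without a GPU, and `totalRows` is a hypothetical consumer, not part of this package.

```scala
// Stand-in for a GPU-backed Table: it holds resources that must be released
// by whoever consumes it. The real class is ai.rapids.cudf.Table.
class FakeTable(val rows: Long) extends AutoCloseable {
  var closed = false
  override def close(): Unit = closed = true
}

// Hypothetical consumer: by convention it closes each table as soon as it is
// done with it, even if the per-table work throws.
def totalRows(tables: Seq[FakeTable]): Long =
  tables.map { t =>
    try t.rows
    finally t.close() // release (GPU) memory promptly
  }.sum
```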

  10. object JoinTypeChecks
  11. object SerializedHostTableUtils extends Arm
  12. object ShimTrampolineUtil
  13. object TrampolineUtil
