Packages

package rapids


Type Members

  1. case class AvroBatchContext(origChunkedBlocks: LinkedHashMap[Path, ArrayBuffer[DataBlockBase]], schema: SchemaBase, mergedHeader: Header) extends BatchContext with Product with Serializable
  2. case class AvroBlockMeta(header: Header, headerSize: Long, blocks: Seq[BlockInfo]) extends Product with Serializable

    Avro block meta info

    header

    the header of avro file

    blocks

    the total block info of avro file

  3. case class AvroDataBlock(blockInfo: BlockInfo) extends DataBlockBase with Product with Serializable

    avro BlockInfo wrapper

  4. case class AvroExtraInfo() extends ExtraInfo with Product with Serializable

    Extra information

  5. case class AvroFileFilterHandler(hadoopConf: Configuration, options: AvroOptions) extends Arm with Logging with Product with Serializable

    A tool to filter Avro blocks

  6. class AvroProviderImpl extends AvroProvider
  7. case class AvroSchemaWrapper(schema: Schema) extends SchemaBase with Product with Serializable

    avro schema wrapper

  8. case class AvroSingleDataBlockInfo(filePath: Path, dataBlock: AvroDataBlock, partitionValues: InternalRow, schema: AvroSchemaWrapper, extraInfo: AvroExtraInfo) extends SingleDataBlockInfo with Product with Serializable
  9. trait BasePad extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with NullIntolerant
  10. class BasicColumnarWriteJobStatsTracker extends ColumnarWriteJobStatsTracker

    Simple ColumnarWriteJobStatsTracker implementation that's serializable, capable of instantiating BasicColumnarWriteTaskStatsTracker on executors and processing the BasicColumnarWriteTaskStats they produce by aggregating the metrics and posting them as DriverMetricUpdates.

  11. case class BasicColumnarWriteTaskStats(numPartitions: Int, numFiles: Int, numBytes: Long, numRows: Long) extends WriteTaskStats with Product with Serializable

    Simple metrics collected during an instance of GpuFileFormatDataWriter. These were first introduced in https://github.com/apache/spark/pull/18159 (SPARK-20703).

  12. class BasicColumnarWriteTaskStatsTracker extends ColumnarWriteTaskStatsTracker with Logging

    Simple metrics collected during an instance of GpuFileFormatDataWriter. This is the columnar version of org.apache.spark.sql.execution.datasources.BasicWriteTaskStatsTracker.

  13. trait ColumnarWriteJobStatsTracker extends Serializable

    A class implementing this trait is basically a collection of parameters that are necessary for instantiating a (derived type of) ColumnarWriteTaskStatsTracker on all executors and then processing the statistics produced by them (e.g. saving them to memory/disk, issuing warnings, etc). It is therefore important that such an object is Serializable, as it will be sent from the driver to all executors.

  14. trait ColumnarWriteTaskStatsTracker extends AnyRef

    A trait for classes that are capable of collecting statistics on columnar data that's being processed by a single write task in GpuFileFormatDataWriter - i.e. there should be one instance per executor.

    newPartition event is only triggered if the relation to be written out is partitioned.
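
    The driver/executor split described for the two tracker traits can be sketched in plain Scala. This is only an illustration under stated assumptions: TaskTracker, JobTracker, newFile, newBatch, getFinalStats, and the return value of processStats are simplified, hypothetical stand-ins and not the actual spark-rapids signatures; only the four statistics fields and the newPartition event are taken from the listing above.

```scala
// Simplified, hypothetical stand-ins; these are NOT the real spark-rapids traits.
object WriteStatsSketch {
  // Same fields as BasicColumnarWriteTaskStats in the listing above.
  final case class TaskStats(numPartitions: Int, numFiles: Int, numBytes: Long, numRows: Long)

  // Collects statistics for the data written by a single write task.
  trait TaskTracker {
    def newPartition(): Unit              // only fired for partitioned output
    def newFile(sizeInBytes: Long): Unit  // hypothetical: the size is passed in directly
    def newBatch(rowCount: Long): Unit
    def getFinalStats(): TaskStats
  }

  // Created on the driver and shipped to executors, hence Serializable.
  trait JobTracker extends Serializable {
    def newTaskInstance(): TaskTracker
    // Hypothetical: returns the aggregate instead of posting driver metric updates.
    def processStats(stats: Seq[TaskStats]): TaskStats
  }

  final class BasicTaskTracker extends TaskTracker {
    private var partitions = 0
    private var files = 0
    private var bytes = 0L
    private var rows = 0L
    def newPartition(): Unit = partitions += 1
    def newFile(sizeInBytes: Long): Unit = { files += 1; bytes += sizeInBytes }
    def newBatch(rowCount: Long): Unit = rows += rowCount
    def getFinalStats(): TaskStats = TaskStats(partitions, files, bytes, rows)
  }

  final class BasicJobTracker extends JobTracker {
    def newTaskInstance(): TaskTracker = new BasicTaskTracker
    // Sum the per-task statistics into a single job-level total.
    def processStats(stats: Seq[TaskStats]): TaskStats =
      stats.foldLeft(TaskStats(0, 0, 0L, 0L)) { (acc, s) =>
        TaskStats(acc.numPartitions + s.numPartitions, acc.numFiles + s.numFiles,
          acc.numBytes + s.numBytes, acc.numRows + s.numRows)
      }
  }
}
```

    The job tracker holds no mutable state of its own, which is what makes it cheap to serialize; all counting happens in the per-task instances.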

  15. trait CpuToGpuAggregateBufferConverter extends AnyRef
  16. trait CpuToGpuBufferTransition extends UnaryExpression with ShimUnaryExpression with CodegenFallback
  17. class CpuToGpuCollectBufferConverter extends CpuToGpuAggregateBufferConverter
  18. case class CpuToGpuCollectBufferTransition(child: Expression, elementType: DataType) extends UnaryExpression with CpuToGpuBufferTransition with Product with Serializable
  19. trait CudfAggregate extends AnyRef
  20. abstract class CudfBinaryArithmetic extends CudfBinaryOperator with NullIntolerant
  21. abstract class CudfBinaryComparison extends CudfBinaryOperator with Predicate
  22. abstract class CudfBinaryMathExpression extends BinaryExpression with CudfBinaryExpression with Serializable with ImplicitCastInputTypes
  23. abstract class CudfBinaryPredicateWithSideEffect extends CudfBinaryOperator with Predicate
  24. class CudfCollectList extends CudfAggregate
  25. class CudfCollectSet extends CudfAggregate
  26. class CudfCount extends CudfAggregate
  27. class CudfM2 extends CudfAggregate
  28. class CudfMax extends CudfAggregate
  29. class CudfMean extends CudfAggregate

    This class is only used by the M2 class aggregates; do not confuse it with GpuAverage. In the future, this aggregate class should be removed and the mean values should be generated in the output of libcudf's M2 aggregate.

  30. class CudfMergeLists extends CudfAggregate
  31. class CudfMergeM2 extends CudfAggregate
  32. class CudfMergeSets extends CudfAggregate
  33. class CudfMin extends CudfAggregate
  34. class CudfNthLikeAggregate extends CudfAggregate
  35. class CudfSum extends CudfAggregate with Arm
  36. abstract class CudfUnaryMathExpression extends GpuUnaryMathExpression with CudfUnaryExpression
  37. case class GpuAbs(child: Expression, failOnError: Boolean) extends GpuUnaryExpression with CudfUnaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  38. case class GpuAcos(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  39. case class GpuAcoshCompat(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  40. case class GpuAcoshImproved(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  41. case class GpuAdd(left: Expression, right: Expression, failOnError: Boolean) extends CudfBinaryArithmetic with Product with Serializable
  42. case class GpuAggregateExpression(origAggregateFunction: GpuAggregateFunction, mode: AggregateMode, isDistinct: Boolean, filter: Option[Expression], resultId: ExprId) extends Expression with GpuExpression with ShimExpression with GpuUnevaluable with Product with Serializable
  43. trait GpuAggregateFunction extends Expression with GpuExpression with ShimExpression with GpuUnevaluable

    Trait that all aggregate functions implement.

    Aggregates start with some input from the child plan or from another aggregate (or from itself if the aggregate is merging several batches).

    In general terms, an aggregate function can be in one of two modes of operation: update or merge. Either the function is aggregating raw input, or it is merging previously aggregated data. Normally, Spark breaks up the processing of the aggregate into two exec nodes (a partial aggregate and a final one), and they are separated by a shuffle boundary. That is not true for all aggregates, especially when looking at other flavors of Spark. What doesn't change is the core function of updating or merging. Note that an aggregate can merge right after an update is performed, as we have cases where input batches are update-aggregated and then a bigger batch is built by merging together those pre-aggregated inputs.

    Aggregates have an interface to Spark and that is defined by aggBufferAttributes. This collection of attributes must match the Spark equivalent of the aggregate, so that if half of the aggregate (update or merge) executes on the CPU, we can be compatible. The GpuAggregateFunction adds special steps to ensure that it can produce (and consume) batches in the shape of aggBufferAttributes.

    The general transitions that are implemented in the aggregate function are as follows:

    1) inputProjection -> updateAggregates: inputProjection creates a sequence of values that are operated on by the updateAggregates. The length of inputProjection must be the same as updateAggregates, and updateAggregates (cuDF aggregates) should be able to work with the product of the inputProjection (i.e. types are compatible)

    2) updateAggregates -> postUpdate: after the cuDF update aggregate, a post process step can (optionally) be performed. The postUpdate takes the output of updateAggregate that must match the order of columns and types as specified in aggBufferAttributes.

    3) postUpdate -> preMerge: preMerge prepares batches before going into the mergeAggregate. The preMerge step binds to aggBufferAttributes, so it can be used to transform a Spark-compatible batch into a batch that the cuDF merge aggregate expects. Its input has the same shape as that produced by postUpdate.

    4) mergeAggregates -> postMerge: postMerge optionally transforms the output of the cuDF merge aggregate in two situations: (1) in partial aggregates, where each partially aggregated batch is merged with AggHelper(merge=true), the step is used to match the aggBufferAttributes references; (2) in a final aggregate, the merged batches are transformed to what evaluateExpression expects. For simple aggregates like sum or count, evaluateExpression is just aggBufferAttributes, but for more complex aggregates, it is an expression (see GpuAverage and GpuM2 subclasses) that relies on the merge step producing columns in the shape of aggBufferAttributes.
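
    The four transitions above can be illustrated with a toy average over plain Scala collections. This is a minimal sketch, not the GpuAggregateFunction API: Column, Buffer, and every function below are hypothetical names chosen to mirror the steps, and the real implementation operates on cuDF columns and bound Spark expressions instead.

```scala
// A toy model of the update/merge flow; all names are hypothetical.
object AggregatePipelineSketch {
  type Column = Seq[Option[Double]]

  // Plays the role of aggBufferAttributes for an average: (sum, count).
  final case class Buffer(sum: Double, count: Long)

  // 1) inputProjection: one projected column per update aggregate.
  def inputProjection(input: Column): Seq[Column] = Seq(input, input)

  // updateAggregates: sum and count of the non-null values.
  val updateAggregates: Seq[Column => Double] = Seq(
    col => col.flatten.sum,
    col => col.flatten.size.toDouble)

  // 2) postUpdate: reshape the raw update results into the buffer layout.
  def postUpdate(results: Seq[Double]): Buffer = Buffer(results(0), results(1).toLong)

  def update(input: Column): Buffer = {
    val projected = inputProjection(input)
    require(projected.length == updateAggregates.length, "one projection per aggregate")
    postUpdate(projected.zip(updateAggregates).map { case (col, agg) => agg(col) })
  }

  // 3) preMerge: identity here, because the buffer is already in the shape merge expects.
  def preMerge(buffers: Seq[Buffer]): Seq[Buffer] = buffers

  // mergeAggregates: combine previously aggregated buffers.
  def mergeAggregates(buffers: Seq[Buffer]): Buffer =
    Buffer(buffers.map(_.sum).sum, buffers.map(_.count).sum)

  // 4) postMerge: identity here; the merged result is already in buffer shape.
  def postMerge(buffer: Buffer): Buffer = buffer

  def merge(buffers: Seq[Buffer]): Buffer = postMerge(mergeAggregates(preMerge(buffers)))

  // evaluateExpression: for an average this is an expression over the buffer.
  def evaluate(buffer: Buffer): Option[Double] =
    if (buffer.count == 0) None else Some(buffer.sum / buffer.count)
}
```

    For example, `update` can be applied to each input batch, the resulting buffers combined with `merge`, and `evaluate` called once on the final buffer.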

  44. case class GpuAnd(left: Expression, right: Expression) extends CudfBinaryPredicateWithSideEffect with Product with Serializable
  45. trait GpuArrayBinaryLike extends Expression with GpuComplexTypeMergingExpression with NullIntolerant
  46. case class GpuArrayContains(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with NullIntolerant with Product with Serializable

    Checks if the array (left) has the element (right)

  47. case class GpuArrayExcept(left: Expression, right: Expression) extends Expression with GpuArrayBinaryLike with ExpectsInputTypes with Product with Serializable
  48. case class GpuArrayIntersect(left: Expression, right: Expression) extends Expression with GpuArrayBinaryLike with ExpectsInputTypes with Product with Serializable
  49. case class GpuArrayMax(child: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  50. case class GpuArrayMin(child: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  51. case class GpuArrayRepeat(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Product with Serializable
  52. case class GpuArrayUnion(left: Expression, right: Expression) extends Expression with GpuArrayBinaryLike with ExpectsInputTypes with Product with Serializable
  53. case class GpuArraysOverlap(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  54. case class GpuArraysZip(children: Seq[Expression]) extends Expression with GpuExpression with ShimExpression with ExpectsInputTypes with Product with Serializable
  55. case class GpuAsin(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  56. case class GpuAsinhCompat(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  57. case class GpuAsinhImproved(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  58. case class GpuAssembleSumChunks(chunkAttrs: Seq[AttributeReference], dataType: DecimalType, nullOnOverflow: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    Reassembles a 128-bit value from four separate 64-bit sum results

    chunkAttrs

    attributes for the four 64-bit sum chunks ordered from least significant to most significant

    dataType

    output type of the reconstructed 128-bit value

    nullOnOverflow

    whether to produce null on overflows

  59. case class GpuAtan(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  60. case class GpuAtanh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  61. abstract class GpuAverage extends Expression with GpuAggregateFunction with GpuReplaceWindowFunction with Serializable
  62. case class GpuAvroMultiFilePartitionReaderFactory(sqlConf: SQLConf, rapidsConf: RapidsConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, options: AvroOptions, metrics: Map[String, GpuMetric], filters: Array[Filter], queryUsesInputFile: Boolean) extends MultiFilePartitionReaderFactoryBase with Product with Serializable

    The multi-file partition reader factory for cloud or coalescing reading of avro file format.

  63. class GpuAvroPartitionReader extends FilePartitionReaderBase with GpuAvroReaderBase

    A PartitionReader that reads an AVRO file split on the GPU.

  64. case class GpuAvroPartitionReaderFactory(sqlConf: SQLConf, rapidsConf: RapidsConf, broadcastedConf: Broadcast[SerializableConfiguration], dataSchema: StructType, readDataSchema: StructType, partitionSchema: StructType, avroOptions: AvroOptions, metrics: Map[String, GpuMetric], params: Map[String, String]) extends ShimFilePartitionReaderFactory with Logging with Product with Serializable

    Avro partition reader factory to build columnar reader

  65. trait GpuAvroReaderBase extends Arm with Logging

    A trait collecting common methods across the 3 kinds of avro readers

  66. case class GpuAvroScan(sparkSession: SparkSession, fileIndex: PartitioningAwareFileIndex, dataSchema: StructType, readDataSchema: StructType, readPartitionSchema: StructType, options: CaseInsensitiveStringMap, pushedFilters: Array[Filter], rapidsConf: RapidsConf, partitionFilters: Seq[Expression] = Seq.empty, dataFilters: Seq[Expression] = Seq.empty, queryUsesInputFile: Boolean = false) extends FileScan with ScanWithMetrics with Product with Serializable
  67. case class GpuBRound(child: Expression, scale: Expression, outputType: DataType) extends GpuRoundBase with Product with Serializable
  68. case class GpuBasicAverage(child: Expression, dt: DataType) extends GpuAverage with Product with Serializable
  69. case class GpuBasicDecimalAverage(child: Expression, dt: DecimalType) extends GpuDecimalAverage with Product with Serializable
  70. case class GpuBasicDecimalSum(child: Expression, dt: DecimalType, failOnErrorOverride: Boolean) extends GpuDecimalSum with Product with Serializable

    Sum aggregations for decimals up to and including DECIMAL64

  71. case class GpuBasicSum(child: Expression, resultType: DataType, failOnErrorOverride: Boolean) extends GpuSum with Product with Serializable

    Sum aggregation for non-decimal types

  72. case class GpuBitLength(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  73. case class GpuBitwiseAnd(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  74. case class GpuBitwiseNot(child: Expression) extends GpuUnaryExpression with CudfUnaryExpression with ExpectsInputTypes with Product with Serializable
  75. case class GpuBitwiseOr(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  76. case class GpuBitwiseXor(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  77. class GpuCartesianPartition extends Partition
  78. case class GpuCartesianProductExec(left: SparkPlan, right: SparkPlan, condition: Option[Expression], targetSizeBytes: Long) extends SparkPlan with ShimBinaryExecNode with GpuExec with Product with Serializable
  79. class GpuCartesianRDD extends RDD[ColumnarBatch] with Serializable with Arm
  80. case class GpuCbrt(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  81. case class GpuCeil(child: Expression, outputType: DataType) extends CudfUnaryMathExpression with Product with Serializable
  82. case class GpuCheckOverflowAfterSum(data: Expression, isEmpty: Expression, dataType: DecimalType, nullOnOverflow: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    This is equivalent to what Spark does after a sum to check for overflow: If(isEmpty, Literal.create(null, resultType), CheckOverflowInSum(sum, d, !SQLConf.get.ansiEnabled))

    It is renamed here to avoid confusion with the overflow detection done as part of the sum itself, which takes the place of the overflow checking that happens with add.
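
    The If/CheckOverflowInSum logic can be sketched on a single value with scala.math.BigDecimal. This is an illustration only: the object and function names are hypothetical, the precision test is a simplification of Spark's decimal bounds check, and the real expression evaluates whole GPU columns.

```scala
// Hypothetical scalar model of the post-sum overflow check.
object CheckOverflowAfterSumSketch {
  def check(
      sum: BigDecimal,
      isEmpty: Boolean,
      precision: Int,
      scale: Int,
      nullOnOverflow: Boolean): Option[BigDecimal] = {
    if (isEmpty) {
      // No input rows: the result of the sum is null.
      None
    } else {
      val rounded = sum.setScale(scale, BigDecimal.RoundingMode.HALF_UP)
      if (rounded.precision <= precision) Some(rounded) // fits in DECIMAL(precision, scale)
      else if (nullOnOverflow) None                     // non-ANSI behavior: null on overflow
      else throw new ArithmeticException(s"Sum overflows DECIMAL($precision, $scale)")
    }
  }
}
```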

  83. trait GpuCollectBase extends Expression with GpuAggregateFunction with GpuDeterministicFirstLastCollectShim with GpuAggregateWindowFunction
  84. case class GpuCollectList(child: Expression, mutableAggBufferOffset: Int = 0, inputAggBufferOffset: Int = 0) extends Expression with GpuCollectBase with Product with Serializable

    Collects and returns a list of non-unique elements.

    The two 'offset' parameters are not used by the GPU version, but are kept for compatibility with the CPU version and automated checks.

  85. case class GpuCollectSet(child: Expression, mutableAggBufferOffset: Int = 0, inputAggBufferOffset: Int = 0) extends Expression with GpuCollectBase with Product with Serializable

    Collects and returns a set of unique elements.

    The two 'offset' parameters are not used by the GPU version, but are kept for compatibility with the CPU version and automated checks.

  86. case class GpuConcat(children: Seq[Expression]) extends Expression with GpuComplexTypeMergingExpression with Product with Serializable
  87. case class GpuConcatWs(children: Seq[Expression]) extends Expression with GpuExpression with ShimExpression with ImplicitCastInputTypes with Product with Serializable
  88. case class GpuContains(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  89. case class GpuCos(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  90. case class GpuCosh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  91. case class GpuCot(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  92. case class GpuCount(children: Seq[Expression]) extends Expression with GpuAggregateFunction with GpuBatchedRunningWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Product with Serializable
  93. case class GpuCreateArray(children: Seq[Expression], useStringTypeWhenEmpty: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  94. case class GpuCreateDataSourceTableAsSelectCommand(table: CatalogTable, mode: SaveMode, query: LogicalPlan, outputColumnNames: Seq[String], origProvider: Class[_], gpuFileFormat: ColumnarFileFormat, useStableSort: Boolean) extends LogicalPlan with GpuDataWritingCommand with Product with Serializable
  95. case class GpuCreateMap(children: Seq[Expression], useStringTypeWhenEmpty: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  96. case class GpuCreateNamedStruct(children: Seq[Expression]) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  97. case class GpuDataSource(sparkSession: SparkSession, className: String, paths: Seq[String] = Nil, userSpecifiedSchema: Option[StructType] = None, partitionColumns: Seq[String] = Seq.empty, bucketSpec: Option[BucketSpec] = None, options: Map[String, String] = Map.empty, catalogTable: Option[CatalogTable] = None, origProvider: Class[_], gpuFileFormat: ColumnarFileFormat) extends Logging with Product with Serializable

    A truncated version of Spark DataSource that converts to use the GPU version of InsertIntoHadoopFsRelationCommand for FileFormats we support. This does not support DataSource V2 writing at this point because at the time of copying, it did not.

  98. trait GpuDataSourceScanExec extends SparkPlan with LeafExecNode with GpuExec

    GPU implementation of Spark's DataSourceScanExec

  99. case class GpuDateAdd(startDate: Expression, days: Expression) extends BinaryExpression with GpuDateMathBase with Product with Serializable
  100. case class GpuDateAddInterval(start: Expression, interval: Expression, timeZoneId: Option[String] = None, ansiEnabled: Boolean = SQLConf.get.ansiEnabled) extends GpuTimeMath with Product with Serializable
  101. case class GpuDateDiff(endDate: Expression, startDate: Expression) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with Product with Serializable
  102. case class GpuDateFormatClass(timestamp: Expression, format: Expression, strfFormat: String, timeZoneId: Option[String] = None) extends BinaryExpression with GpuBinaryExpression with TimeZoneAwareExpression with ImplicitCastInputTypes with Product with Serializable
  103. trait GpuDateMathBase extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes
  104. case class GpuDateSub(startDate: Expression, days: Expression) extends BinaryExpression with GpuDateMathBase with Product with Serializable
  105. trait GpuDateUnaryExpression extends GpuUnaryExpression with ImplicitCastInputTypes
  106. case class GpuDayOfMonth(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  107. case class GpuDayOfWeek(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  108. case class GpuDayOfYear(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  109. case class GpuDecimal128Average(child: Expression, dt: DecimalType) extends GpuDecimalAverage with Product with Serializable

    Average aggregations for DECIMAL128.

    To avoid the significantly slower sort-based aggregations in cudf for DECIMAL128 columns, the incoming DECIMAL128 values are split into four 32-bit chunks which are summed separately into 64-bit intermediate results and then recombined into a 128-bit result with overflow checking. See GpuDecimal128Sum for more details.

  110. case class GpuDecimal128Sum(child: Expression, dt: DecimalType, failOnErrorOverride: Boolean, forceWindowSumToNotBeReplaced: Boolean) extends GpuDecimalSum with GpuReplaceWindowFunction with Product with Serializable

    Sum aggregations for DECIMAL128.

    The sum aggregation is performed by splitting the original 128-bit values into 32-bit "chunks" and summing those. The chunking accomplishes two things. First, it helps avoid cudf resorting to a much slower aggregation since currently DECIMAL128 sums are only implemented for sort-based aggregations. Second, chunking allows detection of overflows.

    The chunked approach to sum aggregation works as follows. The 128-bit value is split into its four 32-bit chunks, with the most significant chunk being an INT32 and the remaining three chunks being UINT32. When these are sum aggregated, cudf will implicitly upscale the accumulated result to a 64-bit value. Since cudf only allows up to 2**31 rows to be aggregated at a time, the "extra" upper 32-bits of the upscaled 64-bit accumulation values will be enough to hold the worst-case "carry" bits from summing each 32-bit chunk.

    After the cudf aggregation has completed, the four 64-bit chunks are reassembled into a 128-bit value. The lowest 32-bits of the least significant 64-bit chunk are used directly as the lowest 32-bits of the final value, and the remaining 32-bits are added to the next most significant 64-bit chunk. The lowest 32-bits of that chunk then become the next 32-bits of the 128-bit value and the remaining 32-bits are added to the next 64-bit chunk, and so on. Finally after the 128-bit value is constructed, the remaining "carry" bits of the most significant chunk after reconstruction are checked against the sign bit of the 128-bit result to see if there was an overflow.
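
    The split, sum, and reassemble steps described above can be modelled on the CPU with BigInt. This is a sketch under stated assumptions, not the plugin's implementation: all names are hypothetical, the per-chunk sums are held in Long accumulators to stand in for cudf's 64-bit results, and only overflow of the signed 128-bit range is detected (any decimal precision check is left out).

```scala
// Hypothetical CPU model of the chunked DECIMAL128 sum.
object ChunkedSumSketch {
  private val Mask32 = BigInt(0xFFFFFFFFL)

  // Split a signed 128-bit value into four 32-bit chunks, least significant first.
  // The three low chunks are unsigned; the most significant chunk keeps the sign.
  def split(v: BigInt): Array[Long] = {
    require(v.bitLength <= 127, "value must fit in a signed 128-bit integer")
    Array(
      (v & Mask32).toLong,
      ((v >> 32) & Mask32).toLong,
      ((v >> 64) & Mask32).toLong,
      (v >> 96).toLong)
  }

  // Sum each chunk position independently into a 64-bit accumulator.
  def sumChunks(values: Seq[BigInt]): Array[Long] =
    values.map(split).foldLeft(Array.fill(4)(0L)) { (acc, chunks) =>
      Array.tabulate(4)(i => acc(i) + chunks(i))
    }

  // Rebuild the 128-bit value, pushing the upper bits of each sum into the next chunk.
  // Returns None when the result does not fit in a signed 128-bit integer.
  def reassemble(sums: Array[Long]): Option[BigInt] = {
    var carry = 0L
    var result = BigInt(0)
    for (i <- 0 until 3) {
      val total = sums(i) + carry
      result += BigInt(total & 0xFFFFFFFFL) << (32 * i) // low 32 bits go into the result
      carry = total >> 32                               // the rest carries upward
    }
    val top = sums(3) + carry
    // The top chunk must still fit in a signed 32-bit value, otherwise the sum overflowed.
    if (top < Int.MinValue || top > Int.MaxValue) None
    else Some(result + (BigInt(top) << 96))
  }

  def sum(values: Seq[BigInt]): Option[BigInt] = reassemble(sumChunks(values))
}
```

    Returning None on overflow corresponds to the null-on-overflow behavior; a failing variant would raise an error at the same point.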

  111. abstract class GpuDecimalAverage extends GpuAverage
  112. case class GpuDecimalDivide(left: Expression, right: Expression, dataType: DecimalType, failOnError: Boolean = SQLConf.get.ansiEnabled) extends Expression with ShimExpression with GpuExpression with Product with Serializable

    A version of Divide specifically for DecimalType that does not force the left and right to be the same type. This lets us calculate the correct result on a wider range of values without the need for unbounded precision in the processing.

  113. case class GpuDecimalMultiply(left: Expression, right: Expression, dataType: DecimalType, needsExtraOverflowChecks: Boolean = false, failOnError: Boolean = SQLConf.get.ansiEnabled) extends Expression with ShimExpression with GpuExpression with Product with Serializable
  114. abstract class GpuDecimalSum extends GpuSum
  115. case class GpuDecimalSumHighDigits(input: Expression, originalInputType: DecimalType) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    This extracts the highest digits from a Decimal value as a part of doing a SUM.

  116. trait GpuDivModLike extends CudfBinaryArithmetic
  117. case class GpuDivide(left: Expression, right: Expression, failOnErrorOverride: Boolean = SQLConf.get.ansiEnabled) extends CudfBinaryArithmetic with GpuDivModLike with Product with Serializable
  118. class GpuDynamicPartitionDataWriter extends GpuFileFormatDataWriter

    Writes data using dynamic partition writes, meaning this single function can write to multiple directories (partitions) or files (bucketing).

  119. case class GpuElementAt(left: Expression, right: Expression, failOnError: Boolean) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with Product with Serializable
  120. class GpuEmptyDirectoryDataWriter extends GpuFileFormatDataWriter

    GPU data writer for empty partitions

  121. case class GpuEndsWith(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  122. case class GpuEqualNullSafe(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable
  123. case class GpuEqualTo(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for Equal-to. To simplify the calculation, we leverage the fact that cudf-result(r) is always false when either operand is NaN, so that result is used in place of false when needed.

    Return (lhs.nan && rhs.nan) || result[i]

    +-------------+------------+------------------+---------------+----+
    | lhs.isNan() | rhs.isNan  | cudf-result(r)   | final-result  | eq |
    +-------------+------------+------------------+---------------+----+
    | t           | f          | f                | r             | f  |
    | f           | t          | f                | r             | f  |
    | t           | t          | f                | t             | t  |
    | f           | f          | r                | r             | na |
    +-------------+------------+------------------+---------------+----+
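
    The rule in the table can be checked on scalar doubles. In this minimal sketch, cudfEquals is a hypothetical stand-in for the cuDF comparison, modelled with IEEE 754 equality (false whenever a NaN is involved); the real expression works on whole columns.

```scala
// Hypothetical scalar model of the NaN-aware equality rule.
object NanAwareEqualsSketch {
  // Stand-in for cudf-result(r): plain IEEE 754 equality.
  def cudfEquals(lhs: Double, rhs: Double): Boolean = lhs == rhs

  // (lhs.nan && rhs.nan) || result
  def gpuEqualTo(lhs: Double, rhs: Double): Boolean =
    (lhs.isNaN && rhs.isNaN) || cudfEquals(lhs, rhs)
}
```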

  124. case class GpuEqualToNoNans(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    This implementation leverages the default implementation of equal-to on the GPU to perform the binary equals comparison. This is used for operations like PivotFirst, where NaN != NaN (unlike most other cases) when pivoting on a float or double column.

  125. case class GpuExp(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  126. case class GpuExpm1(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  127. case class GpuExtractChunk32(data: Expression, chunkIdx: Int, replaceNullsWithZero: Boolean) extends Expression with GpuExpression with ShimExpression with Product with Serializable

    Extracts a 32-bit chunk from a 128-bit value

    data

    expression producing 128-bit values

    chunkIdx

    index of chunk to extract (0-3)

    replaceNullsWithZero

    whether to replace nulls with zero
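
    A scalar sketch of the three parameters follows. It assumes, based on the GpuDecimal128Sum description, that chunks 0-2 are unsigned and chunk 3 is signed; the function name and the Option-based null handling are hypothetical, not the plugin's API.

```scala
// Hypothetical scalar model; nulls are represented as None.
object ExtractChunkSketch {
  private val Mask32 = BigInt(0xFFFFFFFFL)

  def extractChunk32(
      data: Option[BigInt],
      chunkIdx: Int,
      replaceNullsWithZero: Boolean): Option[Long] = {
    require(chunkIdx >= 0 && chunkIdx <= 3, "chunkIdx must be in the range 0-3")
    data match {
      // Assumption: the most significant chunk is signed.
      case Some(v) if chunkIdx == 3 => Some((v >> 96).toLong)
      // Assumption: the lower chunks are unsigned 32-bit values.
      case Some(v) => Some(((v >> (32 * chunkIdx)) & Mask32).toLong)
      case None => if (replaceNullsWithZero) Some(0L) else None
    }
  }
}
```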

  128. abstract class GpuFileFormatDataWriter extends DataWriter[ColumnarBatch]

    Abstract class for writing out data in a single Spark task using the GPU. This is the GPU version of org.apache.spark.sql.execution.datasources.FileFormatDataWriter.

  129. case class GpuFileSourceScanExec(relation: HadoopFsRelation, output: Seq[Attribute], requiredSchema: StructType, partitionFilters: Seq[Expression], optionalBucketSet: Option[BitSet], optionalNumCoalescedBuckets: Option[Int], dataFilters: Seq[Expression], tableIdentifier: Option[TableIdentifier], disableBucketedScan: Boolean = false, queryUsesInputFile: Boolean = false)(rapidsConf: RapidsConf) extends SparkPlan with GpuDataSourceScanExec with GpuExec with Product with Serializable

    GPU version of Spark's FileSourceScanExec

    relation

    The file-based relation to scan.

    output

    Output attributes of the scan, including data attributes and partition attributes.

    requiredSchema

    Required schema of the underlying relation, excluding partition columns.

    partitionFilters

    Predicates to use for partition pruning.

    optionalBucketSet

    Bucket ids for bucket pruning.

    optionalNumCoalescedBuckets

    Number of coalesced buckets.

    dataFilters

    Filters on non-partition columns.

    tableIdentifier

    identifier for the table in the metastore.

    disableBucketedScan

    Disable bucketed scan based on physical query plan.

    queryUsesInputFile

    This is a parameter to easily allow turning it off in GpuTransitionOverrides if InputFileName, InputFileBlockStart, or InputFileBlockLength are used

    rapidsConf

    Rapids conf

  130. case class GpuFirst(child: Expression, ignoreNulls: Boolean) extends Expression with GpuAggregateFunction with GpuAggregateWindowFunction with GpuDeterministicFirstLastCollectShim with ImplicitCastInputTypes with Serializable with Product
  131. case class GpuFloor(child: Expression, outputType: DataType) extends CudfUnaryMathExpression with Product with Serializable
  132. case class GpuFromUnixTime(sec: Expression, format: Expression, strfFormat: String, timeZoneId: Option[String] = None) extends BinaryExpression with GpuBinaryExpression with TimeZoneAwareExpression with ImplicitCastInputTypes with Product with Serializable
  133. case class GpuGetArrayItem(child: Expression, ordinal: Expression, failOnError: Boolean) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with ExtractValue with Product with Serializable

    Returns the field at ordinal in the Array child.

    We need to do type checking here because the ordinal expression may be unresolved.

  134. case class GpuGetArrayStructFields(child: Expression, field: StructField, ordinal: Int, numFields: Int, containsNull: Boolean) extends GpuUnaryExpression with ExtractValue with NullIntolerant with Product with Serializable

    For a child whose data type is an array of structs, extracts the ordinal-th fields of all array elements, and returns them as a new array.

    No need to do type checking since it is handled by 'ExtractValue'.
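
    As a rough CPU-side illustration of the extraction described above, the sketch below models each struct as a Scala Product (a tuple or case class) and pulls the ordinal-th field out of every array element. The object and method names are hypothetical and are not part of the plugin; null handling is not modelled.

```scala
// Hypothetical illustration with plain Scala collections; not the plugin's API.
object ArrayStructFieldsSketch {
  // Each struct is modelled as a Product, so fields can be addressed by position.
  def getArrayStructFields(array: Seq[Product], ordinal: Int): Seq[Any] =
    array.map(_.productElement(ordinal))
}
```

    For example, calling getArrayStructFields on Seq((1, "a"), (2, "b")) with ordinal 1 yields a new sequence holding only the second field of each element.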

  135. class GpuGetArrayStructFieldsMeta extends UnaryExprMeta[GetArrayStructFields]
  136. case class GpuGetMapValue(child: Expression, key: Expression, failOnError: Boolean) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  137. case class GpuGetStructField(child: Expression, ordinal: Int, name: Option[String] = None) extends UnaryExpression with ShimUnaryExpression with GpuExpression with ExtractValue with NullIntolerant with Product with Serializable
  138. case class GpuGetTimestamp(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestamp with Product with Serializable
  139. case class GpuGreaterThan(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for greater-than. To make the calculation easier, we leverage the fact that cudf-result(r) is always false when either operand is NaN, so that result is used in place of false when needed.

    In this case, return (lhs.isNan && !rhs.isNan) || result[i]

    +-------------+-------------+----------------+--------------+-----+
    | lhs.isNan() | rhs.isNan() | cudf-result(r) | final-result | gt  |
    +-------------+-------------+----------------+--------------+-----+
    | t           | f           | f              | t            | t   |
    | f           | t           | f              | r            | f   |
    | t           | t           | f              | r            | f   |
    | f           | f           | r              | r            | NA  |
    +-------------+-------------+----------------+--------------+-----+

  140. case class GpuGreaterThanOrEqual(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for greater-than-or-equal. To make the calculation easier, we leverage the fact that cudf-result(r) is always false when either operand is NaN, so that result is used in place of false when needed.

    In this case, return lhs.isNan || result[i]

    +-------------+-------------+----------------+--------------+-----+
    | lhs.isNan() | rhs.isNan() | cudf-result(r) | final-result | gte |
    +-------------+-------------+----------------+--------------+-----+
    | t           | f           | f              | t            | t   |
    | f           | t           | f              | r            | f   |
    | t           | t           | f              | t            | t   |
    | f           | f           | r              | r            | NA  |
    +-------------+-------------+----------------+--------------+-----+

  141. case class GpuGreatest(children: Seq[Expression]) extends Expression with GpuGreatestLeastBase with Product with Serializable
  142. trait GpuGreatestLeastBase extends Expression with ComplexTypeMergingExpression with GpuExpression with ShimExpression
  143. case class GpuHour(child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with GpuTimeUnaryExpression with Product with Serializable
  144. case class GpuHypot(left: Expression, right: Expression) extends CudfBinaryMathExpression with Product with Serializable
  145. case class GpuInMemoryTableScanExec(attributes: Seq[Attribute], predicates: Seq[Expression], relation: InMemoryRelation) extends SparkPlan with LeafExecNode with GpuExec with Product with Serializable
  146. case class GpuInitCap(child: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  147. case class GpuInputFileBlockLength() extends GpuLeafExpression with Product with Serializable

    Returns the length of the block being read, or -1 if not available. This is extra difficult because batches cannot be coalesced between the input file and the point where this expression is used; otherwise the wrong value could be returned.

  148. case class GpuInputFileBlockStart() extends GpuLeafExpression with Product with Serializable

    Returns the start offset of the block being read, or -1 if not available. This is extra difficult because batches cannot be coalesced between the input file and the point where this expression is used; otherwise the wrong value could be returned.

  149. case class GpuInputFileName() extends GpuLeafExpression with Product with Serializable

    Returns the name of the file being read, or an empty string if not available. This is extra difficult because batches cannot be coalesced between the input file and the point where this expression is used; otherwise the wrong value could be returned.

  150. case class GpuInsertIntoHadoopFsRelationCommand(outputPath: Path, staticPartitions: TablePartitionSpec, ifPartitionNotExists: Boolean, partitionColumns: Seq[Attribute], bucketSpec: Option[BucketSpec], fileFormat: ColumnarFileFormat, options: Map[String, String], query: LogicalPlan, mode: SaveMode, catalogTable: Option[CatalogTable], fileIndex: Option[FileIndex], outputColumnNames: Seq[String], useStableSort: Boolean) extends LogicalPlan with GpuDataWritingCommand with Product with Serializable
  151. case class GpuIntegralDivide(left: Expression, right: Expression) extends CudfBinaryArithmetic with GpuDivModLike with Product with Serializable
  152. case class GpuLast(child: Expression, ignoreNulls: Boolean) extends Expression with GpuAggregateFunction with GpuAggregateWindowFunction with GpuDeterministicFirstLastCollectShim with ImplicitCastInputTypes with Serializable with Product
  153. case class GpuLastDay(startDate: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  154. case class GpuLeast(children: Seq[Expression]) extends Expression with GpuGreatestLeastBase with Product with Serializable
  155. case class GpuLength(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  156. case class GpuLessThan(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for less-than. To make the calculation easier, we leverage the fact that cudf-result(r) is always false when either operand is NaN, so that result is used in place of false when needed.

    In this case, return (!lhs.isNan && rhs.isNan) || result[i]

    +-------------+-------------+----------------+--------------+-----+
    | lhs.isNan() | rhs.isNan() | cudf-result(r) | final-result | lt  |
    +-------------+-------------+----------------+--------------+-----+
    | t           | f           | f              | r            | f   |
    | f           | t           | f              | t            | t   |
    | t           | t           | f              | r            | f   |
    | f           | f           | r              | r            | NA  |
    +-------------+-------------+----------------+--------------+-----+

  157. case class GpuLessThanOrEqual(left: Expression, right: Expression) extends CudfBinaryComparison with NullIntolerant with Product with Serializable

    The table below shows how the result is calculated for less-than-or-equal. To make the calculation easier, we leverage the fact that cudf-result(r) is always false when either operand is NaN, so that result is used in place of false when needed.

    In this case, return rhs.isNan || result[i]

    +-------------+-------------+----------------+--------------+-----+
    | lhs.isNan() | rhs.isNan() | cudf-result(r) | final-result | lte |
    +-------------+-------------+----------------+--------------+-----+
    | t           | f           | f              | r            | f   |
    | f           | t           | f              | t            | t   |
    | t           | t           | f              | t            | t   |
    | f           | f           | r              | r            | NA  |
    +-------------+-------------+----------------+--------------+-----+
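
    The NaN rules for the four comparisons (greater-than, greater-than-or-equal, less-than, less-than-or-equal) can be sketched on the CPU as follows, reading each rule off its truth table. This is only an illustration under stated assumptions: the object name is hypothetical, and the JVM's primitive Double comparison stands in for cudf-result(r), since it also returns false whenever either operand is NaN.

```scala
// Hypothetical CPU-side sketch; the plugin evaluates these on GPU columns.
object NanAwareCompare {
  // lhs > rhs: NaN is greater than any non-NaN value.
  def gt(lhs: Double, rhs: Double): Boolean =
    (lhs.isNaN && !rhs.isNaN) || lhs > rhs

  // lhs >= rhs: a NaN on the left is >= everything, including NaN.
  def gte(lhs: Double, rhs: Double): Boolean =
    lhs.isNaN || lhs >= rhs

  // lhs < rhs: any non-NaN value is less than NaN.
  def lt(lhs: Double, rhs: Double): Boolean =
    (!lhs.isNaN && rhs.isNaN) || lhs < rhs

  // lhs <= rhs: everything is <= a NaN on the right, including NaN.
  def lte(lhs: Double, rhs: Double): Boolean =
    rhs.isNaN || lhs <= rhs
}
```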

  158. case class GpuLike(left: Expression, right: Expression, escapeChar: Char) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  159. case class GpuLog(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  160. case class GpuLogarithm(left: Expression, right: Expression) extends CudfBinaryMathExpression with Product with Serializable
  161. case class GpuLower(child: Expression) extends GpuUnaryString2StringExpression with Product with Serializable
  162. abstract class GpuM2 extends Expression with GpuAggregateFunction with ImplicitCastInputTypes with Serializable

    Base class for overriding standard deviation and variance aggregations. This is also a GPU-based implementation of 'CentralMomentAgg' aggregation class in Spark with the fixed 'momentOrder' variable set to '2'.
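
    To illustrate what a second-order central-moment aggregation tracks, the sketch below keeps a count, a mean, and M2 (the sum of squared deviations from the mean), updates them one value at a time, and merges partial results. It is a CPU-only sketch with hypothetical names (M2State, update, merge); it is not the plugin's implementation and does not model the nullOnDivideByZero parameter.

```scala
// Hypothetical illustration; not part of the plugin.
final case class M2State(n: Double, mean: Double, m2: Double) {
  // Welford-style update with a single value.
  def update(x: Double): M2State = {
    val newN = n + 1.0
    val delta = x - mean
    val newMean = mean + delta / newN
    M2State(newN, newMean, m2 + delta * (x - newMean))
  }

  // Combine two partial states, as a merge stage would.
  def merge(other: M2State): M2State =
    if (n == 0.0) other
    else if (other.n == 0.0) this
    else {
      val total = n + other.n
      val delta = other.mean - mean
      M2State(
        total,
        mean + delta * other.n / total,
        m2 + other.m2 + delta * delta * n * other.n / total)
    }

  def variancePop: Double = m2 / n
  def varianceSamp: Double = m2 / (n - 1.0)
  def stddevPop: Double = math.sqrt(variancePop)
  def stddevSamp: Double = math.sqrt(varianceSamp)
}

object M2State {
  val empty: M2State = M2State(0.0, 0.0, 0.0)
  def of(values: Seq[Double]): M2State =
    values.foldLeft(empty)((state, x) => state.update(x))
}
```

    Standard deviation and variance, in both population and sample form, all derive from the same three buffers, which is why one base class can serve all four aggregations.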

  163. case class GpuMapConcat(children: Seq[Expression]) extends Expression with GpuComplexTypeMergingExpression with Product with Serializable
  164. case class GpuMapEntries(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  165. case class GpuMapKeys(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  166. case class GpuMapValues(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  167. case class GpuMax(child: Expression) extends Expression with GpuAggregateFunction with GpuBatchedRunningWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Product with Serializable
  168. case class GpuMd5(child: Expression) extends GpuUnaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  169. case class GpuMin(child: Expression) extends Expression with GpuAggregateFunction with GpuBatchedRunningWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Product with Serializable
  170. case class GpuMinute(child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with GpuTimeUnaryExpression with Product with Serializable
  171. case class GpuMonth(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  172. class GpuMultiFileAvroPartitionReader extends MultiFileCoalescingPartitionReaderBase with GpuAvroReaderBase

    A PartitionReader that can read multiple AVRO files up to a certain size. It coalesces small files together and copies the block data in a separate thread pool to speed up processing of the small files before sending them down to the GPU.

  173. class GpuMultiFileCloudAvroPartitionReader extends MultiFileCloudPartitionReaderBase with MultiFileReaderFunctions with GpuAvroReaderBase

    A PartitionReader that can read multiple AVRO files in parallel. It is most efficient when running in a cloud environment where read I/O is slow.

    When reading a file, it

    • seeks to the start position of the first block located in this partition.
    • next, parses the meta and sync, rewrites the meta and sync, and copies the data to a batch buffer per block, until it reaches the last block of the current partition.
    • finally, sends the batches to the GPU.
  174. case class GpuMultiply(left: Expression, right: Expression) extends CudfBinaryArithmetic with Product with Serializable
  175. case class GpuMurmur3Hash(children: Seq[Expression], seed: Int) extends Expression with GpuExpression with ShimExpression with Product with Serializable
  176. case class GpuNormalizeNaNAndZero(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  177. case class GpuNot(child: Expression) extends GpuUnaryExpression with CudfUnaryExpression with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  178. case class GpuNthValue(child: Expression, offset: Expression, ignoreNulls: Boolean) extends Expression with GpuAggregateWindowFunction with ImplicitCastInputTypes with Serializable with Product
  179. case class GpuOctetLength(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable
  180. case class GpuOr(left: Expression, right: Expression) extends CudfBinaryPredicateWithSideEffect with Product with Serializable
  181. class GpuOrcFileFormat extends ColumnarFileFormat with Logging
  182. class GpuOrcWriter extends ColumnarOutputWriter
  183. class GpuPartitionwiseSampledRDD extends PartitionwiseSampledRDD[ColumnarBatch, ColumnarBatch]
  184. case class GpuPivotFirst(pivotColumn: Expression, valueColumn: Expression, pivotColumnValues: Seq[Any]) extends Expression with GpuAggregateFunction with Product with Serializable
  185. case class GpuPmod(left: Expression, right: Expression) extends CudfBinaryArithmetic with GpuDivModLike with Product with Serializable
  186. class GpuPoissonSampler extends PoissonSampler[ColumnarBatch] with Arm
  187. case class GpuPow(left: Expression, right: Expression) extends CudfBinaryMathExpression with Product with Serializable
  188. case class GpuPreciseTimestampConversion(child: Expression, fromType: DataType, toType: DataType) extends GpuUnaryExpression with ExpectsInputTypes with Product with Serializable

    Expression used internally to convert the TimestampType to Long and back without losing precision, i.e. in microseconds. Used in time windowing.

  189. trait GpuPredicateHelper extends AnyRef
  190. case class GpuQuarter(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  191. case class GpuRLike(left: Expression, right: Expression, pattern: String) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  192. class GpuRLikeMeta extends BinaryExprMeta[RLike]
  193. case class GpuRaiseError(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with Arm with Product with Serializable
  194. class GpuReadAvroFileFormat extends AvroFileFormat with GpuReadFileFormatWithMetrics

    A FileFormat that allows reading Avro files with the GPU.

  195. case class GpuRegExpExtract(subject: Expression, regexp: Expression, idx: Expression, cudfRegexPattern: String) extends GpuRegExpTernaryBase with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  196. case class GpuRegExpExtractAll(str: Expression, regexp: Expression, idx: Expression, numGroups: Int, cudfRegexPattern: String) extends GpuRegExpTernaryBase with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  197. class GpuRegExpExtractAllMeta extends TernaryExprMeta[RegExpExtractAll]
  198. class GpuRegExpExtractMeta extends TernaryExprMeta[RegExpExtract]
  199. case class GpuRegExpReplace(srcExpr: Expression, searchExpr: Expression, replaceExpr: Expression, javaRegexpPattern: String, cudfRegexPattern: String, cudfReplacementString: String) extends GpuRegExpTernaryBase with ImplicitCastInputTypes with Product with Serializable
  200. case class GpuRegExpReplaceWithBackref(child: Expression, cudfRegexPattern: String, cudfReplacementString: String) extends GpuUnaryExpression with ImplicitCastInputTypes with Product with Serializable
  201. abstract class GpuRegExpTernaryBase extends TernaryExpression with GpuTernaryExpression
  202. case class GpuRemainder(left: Expression, right: Expression) extends CudfBinaryArithmetic with GpuDivModLike with Product with Serializable
  203. case class GpuRint(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  204. case class GpuRound(child: Expression, scale: Expression, outputType: DataType) extends GpuRoundBase with Product with Serializable
  205. abstract class GpuRoundBase extends BinaryExpression with GpuBinaryExpression with Serializable with ImplicitCastInputTypes
  206. case class GpuRowBasedScalaUDF(sparkFunc: AnyRef, dataType: DataType, children: Seq[Expression], inputEncoders: Seq[Option[ExpressionEncoder[_]]], outputEncoder: Option[ExpressionEncoder[_]], udfName: Option[String], nullable: Boolean, udfDeterministic: Boolean) extends Expression with GpuRowBasedUserDefinedFunction with Product with Serializable
  207. case class GpuScalaUDF(function: RapidsUDF, dataType: DataType, children: Seq[Expression], udfName: Option[String], nullable: Boolean, udfDeterministic: Boolean) extends Expression with GpuUserDefinedFunction with Product with Serializable
  208. case class GpuScalarSubquery(plan: BaseSubqueryExec, exprId: ExprId) extends ExecSubqueryExpression with GpuExpression with ShimExpression with Product with Serializable

    GPU placeholder for ScalarSubquery, which returns the scalar result through the columnarEval method. This placeholder lets ScalarSubquery work as a GpuExpression so that it can cooperate with other GPU overrides.

  209. case class GpuSecond(child: Expression, timeZoneId: Option[String] = None) extends GpuUnaryExpression with GpuTimeUnaryExpression with Product with Serializable
  210. case class GpuSequence(start: Expression, stop: Expression, stepOpt: Option[Expression], timeZoneId: Option[String] = None) extends Expression with TimeZoneAwareExpression with GpuExpression with ShimExpression with Product with Serializable
  211. class GpuSequenceMeta extends ExprMeta[Sequence]
  212. class GpuSerializableBatch extends Serializable with AutoCloseable with Arm
    Annotations
    @SerialVersionUID()
  213. trait GpuShiftBase extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes
  214. case class GpuShiftLeft(left: Expression, right: Expression) extends BinaryExpression with GpuShiftBase with Product with Serializable
  215. case class GpuShiftRight(left: Expression, right: Expression) extends BinaryExpression with GpuShiftBase with Product with Serializable
  216. case class GpuShiftRightUnsigned(left: Expression, right: Expression) extends BinaryExpression with GpuShiftBase with Product with Serializable
  217. abstract class GpuShuffleBlockResolverBase extends ShuffleBlockResolver with Logging
  218. class GpuShuffleDependency[K, V, C] extends ShuffleDependency[K, V, C]
  219. class GpuShuffleEnv extends Logging
  220. class GpuShuffleHandle[K, V] extends BaseShuffleHandle[K, V, V]
  221. case class GpuSignum(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  222. case class GpuSin(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  223. class GpuSingleDirectoryDataWriter extends GpuFileFormatDataWriter

    Writes data to a single directory (used for non-dynamic-partition writes).

  224. case class GpuSinh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  225. case class GpuSize(child: Expression, legacySizeOfNull: Boolean) extends GpuUnaryExpression with Product with Serializable
  226. case class GpuSortArray(base: Expression, ascendingOrder: Expression) extends BinaryExpression with GpuBinaryExpression with ExpectsInputTypes with Product with Serializable
  227. case class GpuSqrt(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  228. case class GpuStartsWith(left: Expression, right: Expression) extends BinaryExpression with GpuBinaryExpression with Predicate with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  229. case class GpuStddevPop(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with Product with Serializable
  230. case class GpuStddevSamp(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with GpuReplaceWindowFunction with Product with Serializable
  231. case class GpuStringLPad(str: Expression, len: Expression, pad: Expression) extends TernaryExpression with BasePad with Product with Serializable
  232. case class GpuStringLocate(substr: Expression, col: Expression, start: Expression) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with Product with Serializable
  233. case class GpuStringRPad(str: Expression, len: Expression, pad: Expression) extends TernaryExpression with BasePad with Product with Serializable
  234. case class GpuStringRepeat(input: Expression, repeatTimes: Expression) extends BinaryExpression with GpuBinaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  235. case class GpuStringReplace(srcExpr: Expression, searchExpr: Expression, replaceExpr: Expression) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with Product with Serializable
  236. case class GpuStringSplit(str: Expression, regex: Expression, limit: Expression, pattern: String, isRegExp: Boolean) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with Product with Serializable
  237. class GpuStringSplitMeta extends StringSplitRegExpMeta[StringSplit]
  238. case class GpuStringToMap(strExpr: Expression, pairDelimExpr: Expression, keyValueDelimExpr: Expression, pairDelim: String, isPairDelimRegExp: Boolean, keyValueDelim: String, isKeyValueDelimRegExp: Boolean) extends Expression with GpuExpression with ShimExpression with ExpectsInputTypes with Product with Serializable
  239. class GpuStringToMapMeta extends StringSplitRegExpMeta[StringToMap]
  240. case class GpuStringTrim(column: Expression, trimParameters: Option[Expression] = None) extends Expression with GpuString2TrimExpression with ImplicitCastInputTypes with Product with Serializable
  241. case class GpuStringTrimLeft(column: Expression, trimParameters: Option[Expression] = None) extends Expression with GpuString2TrimExpression with ImplicitCastInputTypes with Product with Serializable
  242. case class GpuStringTrimRight(column: Expression, trimParameters: Option[Expression] = None) extends Expression with GpuString2TrimExpression with ImplicitCastInputTypes with Product with Serializable
  243. case class GpuSubstring(str: Expression, pos: Expression, len: Expression) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with NullIntolerant with Product with Serializable
  244. case class GpuSubstringIndex(strExpr: Expression, regexp: String, ignoredDelimExpr: Expression, ignoredCountExpr: Expression) extends TernaryExpression with GpuTernaryExpression with ImplicitCastInputTypes with Product with Serializable
  245. case class GpuSubtract(left: Expression, right: Expression, failOnError: Boolean) extends CudfBinaryArithmetic with Product with Serializable
  246. abstract class GpuSum extends Expression with GpuAggregateFunction with ImplicitCastInputTypes with GpuBatchedRunningWindowWithFixer with GpuAggregateWindowFunction with GpuRunningWindowFunction with Serializable
  247. case class GpuTan(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  248. case class GpuTanh(child: Expression) extends CudfUnaryMathExpression with Product with Serializable
  249. abstract class GpuTimeMath extends BinaryExpression with ShimBinaryExpression with GpuExpression with TimeZoneAwareExpression with ExpectsInputTypes with Serializable
  250. trait GpuTimeUnaryExpression extends GpuUnaryExpression with TimeZoneAwareExpression with ImplicitCastInputTypes with NullIntolerant
  251. trait GpuToCpuAggregateBufferConverter extends AnyRef
  252. trait GpuToCpuBufferTransition extends UnaryExpression with ShimUnaryExpression with CodegenFallback
  253. class GpuToCpuCollectBufferConverter extends GpuToCpuAggregateBufferConverter
  254. case class GpuToCpuCollectBufferTransition(child: Expression) extends UnaryExpression with GpuToCpuBufferTransition with Product with Serializable
  255. case class GpuToDegrees(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  256. case class GpuToRadians(child: Expression) extends GpuUnaryMathExpression with Product with Serializable
  257. abstract class GpuToTimestamp extends BinaryExpression with GpuBinaryExpression with TimeZoneAwareExpression with ExpectsInputTypes

    A direct conversion of Spark's ToTimestamp class, which converts time to a UNIX timestamp by first converting to microseconds and then dividing by the downScaleFactor.

  258. abstract class GpuToTimestampImproved extends GpuToTimestamp

    An improved version of the GpuToTimestamp conversion, which converts time to a UNIX timestamp without first converting to microseconds.

  259. case class GpuToUnixTimestamp(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestamp with Product with Serializable
  260. case class GpuToUnixTimestampImproved(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestampImproved with Product with Serializable
  261. abstract class GpuUnaryMathExpression extends GpuUnaryExpression with Serializable with ImplicitCastInputTypes
  262. case class GpuUnaryMinus(child: Expression, failOnError: Boolean) extends GpuUnaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  263. case class GpuUnaryPositive(child: Expression) extends GpuUnaryExpression with ExpectsInputTypes with NullIntolerant with Product with Serializable
  264. abstract class GpuUnaryString2StringExpression extends GpuUnaryExpression with ExpectsInputTypes
  265. case class GpuUnixTimestamp(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestamp with Product with Serializable
  266. case class GpuUnixTimestampImproved(strTs: Expression, format: Expression, sparkFormat: String, strf: String, timeZoneId: Option[String] = None) extends GpuToTimestampImproved with Product with Serializable
  267. case class GpuUpper(child: Expression) extends GpuUnaryString2StringExpression with Product with Serializable
  268. case class GpuVariancePop(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with Product with Serializable
  269. case class GpuVarianceSamp(child: Expression, nullOnDivideByZero: Boolean) extends GpuM2 with Product with Serializable
  270. case class GpuWeekDay(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  271. class GpuWriteJobDescription extends Serializable

    A shared job description for all the GPU write tasks. This is the GPU version of org.apache.spark.sql.execution.datasources.WriteJobDescription.

  272. class GpuWriteJobStatsTracker extends BasicColumnarWriteJobStatsTracker

    Simple ColumnarWriteJobStatsTracker implementation that's serializable, capable of instantiating GpuWriteTaskStatsTracker on executors and processing the WriteTaskStats they produce by aggregating the metrics and posting them as DriverMetricUpdates.

  273. class GpuWriteTaskStatsTracker extends BasicColumnarWriteTaskStatsTracker

    ColumnarWriteTaskStatsTracker implementation that produces WriteTaskStats and tracks writing times per task.

  274. case class GpuWriterBucketSpec(bucketIdExpression: Expression, bucketFileNamePrefix: (Int) ⇒ String) extends Product with Serializable

    Bucketing specification for all the write tasks. This is the GPU version of org.apache.spark.sql.execution.datasources.WriterBucketSpec.

    bucketIdExpression

    Expression to calculate bucket id based on bucket column(s).

    bucketFileNamePrefix

    Prefix of output file name based on bucket id.

  275. case class GpuYear(child: Expression) extends GpuUnaryExpression with GpuDateUnaryExpression with Product with Serializable
  276. class InMemoryTableScanMeta extends SparkPlanMeta[InMemoryTableScanExec]
  277. case class ParseFormatMeta(separator: Option[Char], isTimestamp: Boolean, validRegex: String) extends Product with Serializable
  278. class ProxyRapidsShuffleInternalManagerBase extends RapidsShuffleManagerLike with Proxy

    A simple proxy wrapper that delays loading of the real implementation until a later point, when ShimLoader has already updated the Spark classloaders.

  279. class RapidsCachingReader[K, C] extends ShuffleReader[K, C] with Arm with Logging
  280. class RapidsCachingWriter[K, V] extends ShuffleWriter[K, V] with Logging
  281. class RapidsDiskBlockManager extends AnyRef

    Maps logical blocks to local disk locations.

  282. abstract class RapidsShuffleInternalManagerBase extends ShuffleManager with RapidsShuffleHeartbeatHandler with Logging

    A shuffle manager optimized for the RAPIDS Plugin For Apache Spark.

    Note

    This is an internal class to obtain access to the private ShuffleManager and SortShuffleManager classes. When configuring Apache Spark to use the RAPIDS shuffle manager,

  283. trait RapidsShuffleManagerLike extends AnyRef

    Trait that makes it easy to check whether we are dealing with a RAPIDS Shuffle Manager.

  284. abstract class RapidsShuffleThreadedWriterBase[K, V] extends ShuffleWriter[K, V] with RapidsShuffleWriterShimHelper with Arm with Logging
  285. trait RapidsShuffleWriterShimHelper extends AnyRef
  286. case class RegexReplace(search: String, replace: String) extends Product with Serializable
  287. trait ShuffleMetricsUpdater extends AnyRef
  288. abstract class StringSplitRegExpMeta[INPUT <: TernaryExpression] extends TernaryExprMeta[INPUT]
  289. class SubstringIndexMeta extends TernaryExprMeta[SubstringIndex]
  290. case class TempSpillBufferId extends RapidsBufferId with Product with Serializable
  291. class ThreadSafeShuffleWriteMetricsReporter extends ShuffleWriteMetrics

    The ShuffleWriteMetricsReporter is based on accumulators, which are not thread safe. This class is a thin wrapper that adds synchronization, since these metrics will be written by multiple threads.
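
    The thin-wrapper idea can be sketched as below. The WriteMetrics trait and both classes are hypothetical stand-ins invented for this example (they are not Spark's or the plugin's interfaces); the sketch only shows how forwarding every call under a single lock makes a non-thread-safe reporter safe to share between writer threads.

```scala
// Hypothetical stand-in for a metrics reporter; not Spark's actual interface.
trait WriteMetrics {
  def incBytesWritten(v: Long): Unit
  def incRecordsWritten(v: Long): Unit
}

// A reporter that is NOT thread safe: plain read-modify-write on vars.
class PlainWriteMetrics extends WriteMetrics {
  var bytes: Long = 0L
  var records: Long = 0L
  def incBytesWritten(v: Long): Unit = bytes += v
  def incRecordsWritten(v: Long): Unit = records += v
}

// Thin wrapper: each call is forwarded while holding the wrapper's lock.
class SynchronizedWriteMetrics(wrapped: WriteMetrics) extends WriteMetrics {
  def incBytesWritten(v: Long): Unit = synchronized {
    wrapped.incBytesWritten(v)
  }
  def incRecordsWritten(v: Long): Unit = synchronized {
    wrapped.incRecordsWritten(v)
  }
}
```

    The wrapper adds no state of its own; all updates must go through it for the guarantee to hold.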

  292. sealed trait TimeParserPolicy extends Serializable
  293. abstract class UnixTimeExprMeta[A <: BinaryExpression with TimeZoneAwareExpression] extends BinaryExprMeta[A]
  294. case class WindowStddevSamp(child: Expression, nullOnDivideByZero: Boolean) extends Expression with GpuAggregateWindowFunction with Product with Serializable
  295. case class WrappedAggFunction(aggregateFunction: GpuAggregateFunction, filter: Expression) extends Expression with GpuAggregateFunction with Product with Serializable

Value Members

  1. object BasicColumnarWriteJobStatsTracker extends Serializable
  2. object CorrectedTimeParserPolicy extends TimeParserPolicy
  3. object CudfNthLikeAggregate
  4. object CudfRegexp
  5. object ExceptionTimeParserPolicy extends TimeParserPolicy
  6. object ExternalSource extends Logging

    The subclass of AvroProvider imports spark-avro classes. This file should not import spark-avro classes, because a class-not-found exception may be thrown if spark-avro is not present at runtime. For details, see https://github.com/NVIDIA/spark-rapids/issues/5648
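
    One common way to keep a file free of compile-time references to an optional dependency is to probe for it by name through reflection. The sketch below is an assumption about the general technique, not the plugin's actual code; OptionalClassSketch and its methods are hypothetical names.

```scala
object OptionalClassSketch {
  // True when the named class can be loaded; no compile-time reference is needed.
  def isClassAvailable(className: String): Boolean =
    try {
      Class.forName(className)
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  // Instantiate the class through its no-argument constructor only if it is present.
  def newInstanceIfAvailable(className: String): Option[AnyRef] =
    if (isClassAvailable(className)) {
      Some(Class.forName(className).getDeclaredConstructor().newInstance().asInstanceOf[AnyRef])
    } else {
      None
    }
}
```

    A caller would pass the fully qualified name of the provider implementation as a string, so the optional classes are only touched when they are actually on the classpath.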

  7. object GpuAdd extends Arm with Serializable
  8. object GpuAnsi extends Arm
  9. object GpuAverage extends Serializable
  10. object GpuAvroScan extends Serializable
  11. object GpuCreateMap extends Arm with Serializable
  12. object GpuDataSource extends Logging with Serializable
  13. object GpuDataSourceScanExec extends Serializable
  14. object GpuDecimalDivide extends Serializable
  15. object GpuDecimalMultiply extends Arm with Serializable
  16. object GpuDecimalSumOverflow

    All decimal processing in Spark has overflow detection as a part of it. Either it replaces the value with a null in non-ANSI mode, or it throws an exception in ANSI mode. Spark will also do the processing for larger values as Decimal values which are based on BigDecimal and have unbounded precision. So in most cases it is impossible to overflow/underflow so much that an incorrect value is returned. Spark will just use more and more memory to hold the value and then check for overflow at some point when the result needs to be turned back into a 128-bit value.

    We cannot do the same thing. Instead we take four strategies to detect overflow.

    1. For decimal values with a precision of 8 or under we follow Spark and do the SUM on the unscaled value as a long, and then bit-cast the result back to a Decimal value. This means that we can SUM 174,467,442,481 maximum or minimum decimal values with a precision of 8 before overflow can no longer be detected. It is much higher for decimal values with a smaller precision.
    2. For decimal values with a precision from 9 to 20 inclusive we sum them as 128-bit values. This is very similar to what we do in the first strategy. The main differences are that we use a 128-bit value when doing the sum, and we check for overflow after processing each batch. In the case of group-by and reduction that happens after the update stage and also after each merge stage. This gives us enough room that we can always detect overflow when summing a single batch, even on a merge where we could be doing the aggregation on a batch that has all max output values in it.
    3. For values from 21 to 28 inclusive we have enough room to not check for overflow on the update aggregation, but for the merge aggregation we need to do some extra checks. This is done by taking the digits above 28 and summing them separately. We then check to see if they would have overflowed the original limits. This lets us detect overflow in cases where the original value would have wrapped around. This works because there is a hard limit on the maximum number of values in a single batch being processed: Int.MaxValue, or about 2.1 billion values. So we use a precision on the higher values that is large enough to handle 2.1 billion values and still detect overflow. This equates to a precision of about 10 more than is needed to hold the higher digits. This effectively gives us unlimited overflow detection.
    4. For anything larger than precision 28 we do the same overflow detection as in strategy 3, but also do it on the update aggregation. This lets us fully detect overflows in any stage of an aggregation.

    Note that for Window operations either there is no merge stage or it only has a single value being merged into a batch instead of an entire batch being merged together. This lets us handle the overflow detection with what is built into GpuAdd.
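
    For reference, the Spark-side behavior described at the start of this note (accumulate with unbounded precision, check once, then either return null or throw) can be sketched as follows. This is a simplified illustration on unscaled integer values using BigInt, not the GPU implementation; DecimalSumOverflowSketch, bound and checkedSum are hypothetical names, and None stands in for a SQL null.

```scala
object DecimalSumOverflowSketch {
  // Exclusive magnitude limit for an unscaled value with the given decimal precision.
  def bound(precision: Int): BigInt = BigInt(10).pow(precision)

  // Sum with unbounded precision, then check the result against the target precision.
  // Non-ANSI mode yields None (null) on overflow; ANSI mode throws instead.
  def checkedSum(unscaled: Seq[BigInt], resultPrecision: Int, ansi: Boolean): Option[BigInt] = {
    val total = unscaled.sum
    if (total.abs < bound(resultPrecision)) {
      Some(total)
    } else if (ansi) {
      throw new ArithmeticException(s"Decimal sum overflowed precision $resultPrecision")
    } else {
      None
    }
  }
}
```

    The strategies listed above exist because a fixed-width accumulator cannot grow the way BigInt does here, so the check has to happen before the accumulator itself can wrap around.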

  17. object GpuDivModLike extends Arm
  18. object GpuFileFormatWriter extends Logging

    A helper object for writing columnar data out to a location.

  19. object GpuFileSourceScanExec extends Serializable
  20. object GpuFloorCeil
  21. object GpuHypot extends Arm with Serializable
  22. object GpuLogarithm extends Arm with Serializable
  23. object GpuMurmur3Hash extends Arm with Serializable
  24. object GpuOrcFileFormat extends Logging
  25. object GpuReadAvroFileFormat extends Serializable
  26. object GpuRegExpUtils
  27. object GpuScalaUDF extends Serializable
  28. object GpuScalaUDFMeta
  29. object GpuSequenceUtil extends Arm
  30. object GpuShuffleEnv extends Logging
  31. object GpuSubstringIndex extends Serializable
  32. object GpuSum extends Serializable
  33. object GpuToTimestamp extends Arm
  34. object GpuWriteJobStatsTracker extends Serializable
  35. object InputFileUtils
  36. object LegacyTimeParserPolicy extends TimeParserPolicy
  37. object PCBSSchemaHelper
  38. object RapidsShuffleInternalManagerBase extends Logging
  39. object ShiftHelper extends Arm
  40. object TempSpillBufferId extends Serializable

Ungrouped