Packages

class ParquetReadSupport extends ReadSupport[InternalRow] with Logging

A Parquet ReadSupport implementation for reading Parquet records as Catalyst InternalRows.

The API interface of ReadSupport is a little bit over complicated because of historical reasons. In older versions of parquet-mr (say 1.6.0rc3 and prior), ReadSupport need to be instantiated and initialized twice on both driver side and executor side. The init() method is for driver side initialization, while prepareForRead() is for executor side. However, starting from parquet-mr 1.6.0, it's no longer the case, and ReadSupport is only instantiated and initialized on executor side. So, theoretically, now it's totally fine to combine these two methods into a single initialization method. The only reason (I could think of) to still have them here is for parquet-mr API backwards-compatibility.

Due to this reason, we no longer rely on ReadContext to pass requested schema from init() to prepareForRead(), but use a private var for simplicity.

Linear Supertypes
Logging, ReadSupport[InternalRow], AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. ParquetReadSupport
  2. Logging
  3. ReadSupport
  4. AnyRef
  5. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Instance Constructors

  1. new ParquetReadSupport()
  2. new ParquetReadSupport(convertTz: Option[ZoneId], enableVectorizedReader: Boolean, datetimeRebaseSpec: RebaseSpec, int96RebaseSpec: RebaseSpec)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  6. val convertTz: Option[ZoneId]
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  11. def init(context: InitContext): ReadContext

    Called on executor side before prepareForRead() and instantiating actual Parquet record readers.

    Called on executor side before prepareForRead() and instantiating actual Parquet record readers. Responsible for figuring out Parquet requested schema used for column pruning.

    Definition Classes
    ParquetReadSupport → ReadSupport
  12. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  13. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  14. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  15. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  16. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  17. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  18. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  19. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  20. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  21. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  22. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  23. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  24. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  25. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  26. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  27. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  29. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  30. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  31. def prepareForRead(conf: Configuration, keyValueMetaData: Map[String, String], fileSchema: MessageType, readContext: ReadContext): RecordMaterializer[InternalRow]

    Called on executor side after init(), before instantiating actual Parquet record readers.

    Called on executor side after init(), before instantiating actual Parquet record readers. Responsible for instantiating RecordMaterializer, which is used for converting Parquet records to Catalyst InternalRows.

    Definition Classes
    ParquetReadSupport → ReadSupport
  32. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  33. def toString(): String
    Definition Classes
    AnyRef → Any
  34. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  36. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
  2. def init(arg0: Configuration, arg1: Map[String, String], arg2: MessageType): ReadContext
    Definition Classes
    ReadSupport
    Annotations
    @Deprecated
    Deprecated

Inherited from Logging

Inherited from ReadSupport[InternalRow]

Inherited from AnyRef

Inherited from Any

Ungrouped