org.apache.spark.mllib.feature

ChiSqSelector

class ChiSqSelector extends Serializable

Creates a ChiSquared feature selector. The selector supports different selection methods: numTopFeatures, percentile, fpr, fdr, fwe.

  • numTopFeatures chooses a fixed number of top features according to a chi-squared test.
  • percentile is similar but chooses a fraction of all features instead of a fixed number.
  • fpr chooses all features whose p-values are below a threshold, thus controlling the false positive rate of selection.
  • fdr uses the Benjamini-Hochberg procedure (https://en.wikipedia.org/wiki/False_discovery_rate#Benjamini.E2.80.93Hochberg_procedure) to choose all features whose false discovery rate is below a threshold.
  • fwe chooses all features whose p-values are below a threshold. The threshold is scaled by 1/numFeatures, thus controlling the family-wise error rate of selection.

By default, the selection method is numTopFeatures, with the default number of top features set to 50.
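
The five selection rules above can be sketched in plain Scala, operating directly on a list of per-feature p-values. This is an illustrative sketch only, not Spark's implementation: the object `SelectionSketch`, the method `select`, and its `param` argument are hypothetical names introduced here, and the fdr branch assumes the standard Benjamini-Hochberg step-up rule.

```scala
object SelectionSketch {
  /**
   * Hypothetical helper: returns the indices (ascending) of the features kept
   * under the given selection method. `param` is the method's single setting:
   * a count, a fraction, or a threshold.
   */
  def select(pValues: Seq[Double], method: String, param: Double): Seq[Int] = {
    val m = pValues.length
    // (pValue, featureIndex) pairs ordered from most to least significant.
    val ranked = pValues.zipWithIndex.sortBy(_._1)
    val kept = method match {
      case "numTopFeatures" => ranked.take(param.toInt)
      case "percentile"     => ranked.take((m * param).toInt)
      case "fpr"            => ranked.filter(_._1 < param)
      case "fdr" =>
        // Benjamini-Hochberg (assumed): find the largest rank k such that
        // p(k) <= k / m * threshold, then keep the k most significant features.
        val passing = ranked.zipWithIndex.filter {
          case ((p, _), rank) => p <= param * (rank + 1) / m
        }
        if (passing.isEmpty) Seq.empty else ranked.take(passing.map(_._2).max + 1)
      case "fwe" =>
        // Threshold scaled by 1 / numFeatures.
        ranked.filter(_._1 < param / m)
      case other =>
        throw new IllegalArgumentException(s"Unknown selector type: $other")
    }
    kept.map(_._2).sorted
  }
}
```

For example, `SelectionSketch.select(Seq(0.001, 0.04, 0.03, 0.5, 0.2), "fpr", 0.05)` keeps every feature whose p-value is below 0.05, while the same call with `"fwe"` compares each p-value against 0.05 / 5 instead.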
Annotations
@Since( "1.3.0" )
Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new ChiSqSelector(numTopFeatures: Int)

    This is the same as calling this() and then setNumTopFeatures(numTopFeatures).

    Annotations
    @Since( "1.3.0" )
  2. new ChiSqSelector()
    Annotations
    @Since( "2.1.0" )

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. var fdr: Double
  9. def fit(data: RDD[LabeledPoint]): ChiSqSelectorModel

    Returns a ChiSqSelectorModel, the ChiSquared feature selector fitted to data.

    data

    an RDD[LabeledPoint] containing the labeled dataset with categorical features. Real-valued features are treated as categorical, with each distinct value as its own category, so apply a feature discretizer before calling this method.

    Annotations
    @Since( "1.3.0" )
  10. var fpr: Double
  11. var fwe: Double
  12. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  13. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  14. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  15. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  17. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  18. var numTopFeatures: Int
  19. var percentile: Double
  20. var selectorType: String
  21. def setFdr(value: Double): ChiSqSelector.this.type
    Annotations
    @Since( "2.2.0" )
  22. def setFpr(value: Double): ChiSqSelector.this.type
    Annotations
    @Since( "2.1.0" )
  23. def setFwe(value: Double): ChiSqSelector.this.type
    Annotations
    @Since( "2.2.0" )
  24. def setNumTopFeatures(value: Int): ChiSqSelector.this.type
    Annotations
    @Since( "1.6.0" )
  25. def setPercentile(value: Double): ChiSqSelector.this.type
    Annotations
    @Since( "2.1.0" )
  26. def setSelectorType(value: String): ChiSqSelector.this.type
    Annotations
    @Since( "2.1.0" )
  27. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  28. def toString(): String
    Definition Classes
    AnyRef → Any
  29. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  31. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
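
The setters and fit listed above can be combined as in the following minimal sketch. It assumes Spark MLlib is on the classpath and runs against a local master. The object name `ChiSqSelectorExample`, the helper `run`, and the toy data are made up for illustration: feature 0 is constructed to mirror the label, and feature 1 is constructed to be independent of it.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.ChiSqSelector
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

object ChiSqSelectorExample {
  // Toy, already-discrete data (hypothetical): feature 0 equals the label,
  // feature 1 takes both values equally often within each label.
  val points: Seq[LabeledPoint] = Seq(
    LabeledPoint(0.0, Vectors.dense(0.0, 0.0)),
    LabeledPoint(0.0, Vectors.dense(0.0, 1.0)),
    LabeledPoint(0.0, Vectors.dense(0.0, 0.0)),
    LabeledPoint(0.0, Vectors.dense(0.0, 1.0)),
    LabeledPoint(1.0, Vectors.dense(1.0, 0.0)),
    LabeledPoint(1.0, Vectors.dense(1.0, 1.0)),
    LabeledPoint(1.0, Vectors.dense(1.0, 0.0)),
    LabeledPoint(1.0, Vectors.dense(1.0, 1.0))
  )

  /** Fits a selector that keeps one feature and returns the selected indices. */
  def run(sc: SparkContext): Array[Int] = {
    val data = sc.parallelize(points)

    // Setters return the selector itself, so calls can be chained.
    val selector = new ChiSqSelector()
      .setSelectorType("numTopFeatures")
      .setNumTopFeatures(1)

    val model = selector.fit(data)

    // Project every point onto the selected features.
    val reduced = data.map(lp => LabeledPoint(lp.label, model.transform(lp.features)))
    reduced.collect().foreach(println)

    model.selectedFeatures
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ChiSqSelectorExample").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(run(sc).mkString(", ")) finally sc.stop()
  }
}
```

To use a threshold-based method instead, the chained calls could be swapped for something like `.setSelectorType("fpr").setFpr(0.05)`; the threshold value here is only an example.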

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
