Class DataTypes

java.lang.Object
org.apache.flink.table.api.DataTypes

@PublicEvolving public final class DataTypes extends Object
A DataType can be used to declare input and/or output types of operations. This class enumerates all pre-defined data types of the Table & SQL API.

For convenience, this class also contains methods for creating UnresolvedDataTypes that need to be resolved at later stages. This is particularly useful for more complex types that are expressed as a Class (see of(Class)) or types that need to be looked up in a catalog (see of(String)).

NOTE: Planners might not support every data type with the desired precision or parameter. Please see the planner compatibility and limitations section in the website documentation before using a data type.

  • Method Details

    • of

      public static DataType of(LogicalType logicalType)
      Creates a DataType from a LogicalType with default conversion class.
      Returns:
      the LogicalType converted to a DataType.
    • of

      public static UnresolvedDataType of(Class<?> unresolvedClass)
      Creates an unresolved type that will be resolved to a DataType by analyzing the given class later.

      During resolution, Java reflection is used. DataTypeHint annotations can support the extraction for nested, structured types.

      A ValidationException is thrown in cases where the reflective extraction needs more information or simply fails.

      The following examples show how to use and enrich the extraction process:

      
       // returns INT
       of(Integer.class)
      
       // returns TIMESTAMP(9)
       of(java.time.LocalDateTime.class)
      
       // returns an anonymous, unregistered structured type
       // that is deeply integrated into the API compared to opaque RAW types
       class User {
      
         // extract fields automatically
         public String name;
         public int age;
      
         // enrich the extraction with precision information
         public @DataTypeHint("DECIMAL(10,2)") BigDecimal accountBalance;
      
         // enrich the extraction with forcing using RAW types
         public @DataTypeHint(forceRawPattern = "scala.") Address address;
      
         // enrich the extraction by specifying defaults
         public @DataTypeHint(defaultSecondPrecision = 3) Log log;
       }
       of(User.class)
       

      Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • of

      public static UnresolvedDataType of(String unresolvedName)
      Creates an unresolved type that will be resolved to a DataType by using a fully or partially defined name.

      It includes both built-in types (e.g. "INT") as well as user-defined types (e.g. "mycat.mydb.Money").

      Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • of

      public static UnresolvedDataType of(org.apache.flink.api.common.typeinfo.TypeInformation<?> typeInfo)
      Creates an unresolved type that will be resolved to a DataType by converting the given TypeInformation later.

      DataType is richer than TypeInformation as it also includes details about the LogicalType. Therefore, some details will be added implicitly during the conversion. The mapping to a data type happens on a best-effort basis. If no data type is suitable, the type information is interpreted as RAW(TypeInformation).

      See TypeInfoDataTypeConverter for more information.

      Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • CHAR

      public static DataType CHAR(int n)
      Data type of a fixed-length character string CHAR(n) where n is the number of code points. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
    • VARCHAR

      public static DataType VARCHAR(int n)
      Data type of a variable-length character string VARCHAR(n) where n is the maximum number of code points. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
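
      Because n for CHAR(n) and VARCHAR(n) counts code points rather than UTF-16 char units, String.length() can overstate the length of a value. The following is a minimal sketch in plain Java that only illustrates the counting; the class CodePointLength and the helper fitsVarchar are hypothetical and say nothing about how a planner enforces the limit.

```java
final class CodePointLength {

    /** Number of Unicode code points in the string (not UTF-16 char units). */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Hypothetical check: does the value have at most n code points, as VARCHAR(n) would require? */
    static boolean fitsVarchar(String s, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be between 1 and Integer.MAX_VALUE");
        }
        return codePoints(s) <= n;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (one code point stored as two chars), 'b'
        String text = "a\uD83D\uDE00b";
        System.out.println(text.length());        // 4 UTF-16 chars
        System.out.println(codePoints(text));     // 3 code points
        System.out.println(fitsVarchar(text, 3)); // true
    }
}
```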
    • STRING

      public static DataType STRING()
      Data type of a variable-length character string with defined maximum length. This is a shortcut for VARCHAR(2147483647) for representing JVM strings.
    • BOOLEAN

      public static DataType BOOLEAN()
      Data type of a boolean with a (possibly) three-valued logic of TRUE, FALSE, UNKNOWN.
    • BINARY

      public static DataType BINARY(int n)
      Data type of a fixed-length binary string (=a sequence of bytes) BINARY(n) where n is the number of bytes. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
    • VARBINARY

      public static DataType VARBINARY(int n)
      Data type of a variable-length binary string (=a sequence of bytes) VARBINARY(n) where n is the maximum number of bytes. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
    • BYTES

      public static DataType BYTES()
      Data type of a variable-length binary string (=a sequence of bytes) with defined maximum length. This is a shortcut for VARBINARY(2147483647) for representing JVM byte arrays.
    • DECIMAL

      public static DataType DECIMAL(int precision, int scale)
      Data type of a decimal number with fixed precision and scale DECIMAL(p, s) where p is the number of digits in a number (=precision) and s is the number of digits to the right of the decimal point in a number (=scale). p must have a value between 1 and 38 (both inclusive). s must have a value between 0 and p (both inclusive).
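
      The constraints on p and s can be made concrete with java.math.BigDecimal. The sketch below is an illustration only: DecimalFit and its methods are hypothetical helpers, and the notion of "fits without rounding" is an assumption made for this example, not a description of planner behavior.

```java
import java.math.BigDecimal;

final class DecimalFit {

    /** Validates the parameter ranges stated for DECIMAL(p, s). */
    static void checkParameters(int p, int s) {
        if (p < 1 || p > 38) {
            throw new IllegalArgumentException("precision must be between 1 and 38");
        }
        if (s < 0 || s > p) {
            throw new IllegalArgumentException("scale must be between 0 and precision");
        }
    }

    /** Hypothetical helper: true if the value is representable as DECIMAL(p, s) without rounding. */
    static boolean fits(BigDecimal value, int p, int s) {
        checkParameters(p, s);
        if (value.signum() == 0) {
            return true; // zero fits every valid DECIMAL(p, s)
        }
        // Trailing zeros do not require extra fractional digits.
        BigDecimal v = value.stripTrailingZeros();
        int integerDigits = v.precision() - v.scale();
        return v.scale() <= s && integerDigits <= p - s;
    }

    public static void main(String[] args) {
        System.out.println(fits(new BigDecimal("12345678.90"), 10, 2));  // true
        System.out.println(fits(new BigDecimal("123456789.00"), 10, 2)); // false: 9 integer digits
        System.out.println(fits(new BigDecimal("1.234"), 10, 2));        // false: 3 fractional digits
    }
}
```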
    • TINYINT

      public static DataType TINYINT()
      Data type of a 1-byte signed integer with values from -128 to 127.
    • SMALLINT

      public static DataType SMALLINT()
      Data type of a 2-byte signed integer with values from -32,768 to 32,767.
    • INT

      public static DataType INT()
      Data type of a 4-byte signed integer with values from -2,147,483,648 to 2,147,483,647.
    • BIGINT

      public static DataType BIGINT()
      Data type of an 8-byte signed integer with values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
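
      The value ranges listed for TINYINT, SMALLINT, INT, and BIGINT coincide with those of Java's byte, short, int, and long. The following sketch prints them from the JDK constants; the class IntegerRanges is a hypothetical helper written for this illustration.

```java
final class IntegerRanges {

    /** Returns "min to max" for the Java primitive whose width matches the given SQL type name. */
    static String range(String sqlType) {
        switch (sqlType) {
            case "TINYINT":
                return Byte.MIN_VALUE + " to " + Byte.MAX_VALUE;
            case "SMALLINT":
                return Short.MIN_VALUE + " to " + Short.MAX_VALUE;
            case "INT":
                return Integer.MIN_VALUE + " to " + Integer.MAX_VALUE;
            case "BIGINT":
                return Long.MIN_VALUE + " to " + Long.MAX_VALUE;
            default:
                throw new IllegalArgumentException("Not an integer type: " + sqlType);
        }
    }

    public static void main(String[] args) {
        for (String type : new String[] {"TINYINT", "SMALLINT", "INT", "BIGINT"}) {
            System.out.println(type + ": " + range(type));
        }
    }
}
```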
    • FLOAT

      public static DataType FLOAT()
      Data type of a 4-byte single precision floating point number.
    • DOUBLE

      public static DataType DOUBLE()
      Data type of an 8-byte double precision floating point number.
    • DATE

      public static DataType DATE()
      Data type of a date consisting of year-month-day with values ranging from 0000-01-01 to 9999-12-31.

      Compared to the SQL standard, the range starts at year 0000.

    • TIME

      public static DataType TIME(int precision)
      Data type of a time WITHOUT time zone TIME(p) where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

      An instance consists of hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 00:00:00.000000000 to 23:59:59.999999999.

      Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalTime. A time WITH time zone is not provided.
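
      Since the semantics follow java.time.LocalTime, the leap-second restriction can be observed directly in the JDK. The sketch below shows that LocalTime rejects a seconds value of 60 and that its maximum matches the upper bound given above; LeapSecondCheck is a hypothetical class name used only for this illustration.

```java
import java.time.DateTimeException;
import java.time.LocalTime;

final class LeapSecondCheck {

    /** True if LocalTime accepts the given hour, minute, and second. */
    static boolean isValidTime(int hour, int minute, int second) {
        try {
            LocalTime.of(hour, minute, second);
            return true;
        } catch (DateTimeException e) {
            // Thrown when a field is out of range, e.g. second = 60.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidTime(23, 59, 59)); // true
        System.out.println(isValidTime(23, 59, 60)); // false: leap second is rejected
        System.out.println(LocalTime.MAX);           // 23:59:59.999999999
    }
}
```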

    • TIME

      public static DataType TIME()
      Data type of a time WITHOUT time zone TIME with no fractional seconds by default.

      An instance consists of hour:minute:second with up to second precision and values ranging from 00:00:00 to 23:59:59.

      Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalTime. A time WITH time zone is not provided.

    • TIMESTAMP

      public static DataType TIMESTAMP(int precision)
      Data type of a timestamp WITHOUT time zone TIMESTAMP(p) where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

      An instance consists of year-month-day hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999.

      Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalDateTime.

    • TIMESTAMP

      public static DataType TIMESTAMP()
      Data type of a timestamp WITHOUT time zone TIMESTAMP with 6 digits of fractional seconds by default.

      An instance consists of year-month-day hour:minute:second[.fractional] with up to microsecond precision and values ranging from 0000-01-01 00:00:00.000000 to 9999-12-31 23:59:59.999999.

      Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalDateTime.

    • TIMESTAMP_WITH_TIME_ZONE

      public static DataType TIMESTAMP_WITH_TIME_ZONE(int precision)
      Data type of a timestamp WITH time zone TIMESTAMP(p) WITH TIME ZONE where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

      An instance consists of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59.

      Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

    • TIMESTAMP_WITH_TIME_ZONE

      public static DataType TIMESTAMP_WITH_TIME_ZONE()
      Data type of a timestamp WITH time zone TIMESTAMP WITH TIME ZONE with 6 digits of fractional seconds by default.

      An instance consists of year-month-day hour:minute:second[.fractional] zone with up to microsecond precision and values ranging from 0000-01-01 00:00:00.000000 +14:59 to 9999-12-31 23:59:59.999999 -14:59.

      Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

    • TIMESTAMP_WITH_LOCAL_TIME_ZONE

      public static DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE(int precision)
      Data type of a timestamp WITH LOCAL time zone TIMESTAMP(p) WITH LOCAL TIME ZONE where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

      An instance consists of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59. Leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

      Compared to ZonedTimestampType, the time zone offset information is not stored physically in every datum. Instead, the type assumes Instant semantics in UTC time zone at the edges of the table ecosystem. Every datum is interpreted in the local time zone configured in the current session for computation and visualization.

      This type fills the gap between time zone free and time zone mandatory timestamp types by allowing the interpretation of UTC timestamps according to the configured session timezone.
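
      The idea of storing a UTC instant and interpreting it in a session time zone can be sketched with java.time alone. This is a conceptual illustration, not the runtime's implementation: SessionZoneView and inSessionZone are hypothetical names, and the session time zone is passed in as a plain zone ID string.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

final class SessionZoneView {

    /** Interprets a UTC instant as wall-clock time in the given (assumed) session time zone. */
    static LocalDateTime inSessionZone(Instant instant, String sessionZone) {
        return LocalDateTime.ofInstant(instant, ZoneId.of(sessionZone));
    }

    public static void main(String[] args) {
        // One stored datum, no offset attached to it.
        Instant stored = Instant.parse("2024-01-01T00:00:00Z");

        // The same datum rendered for two different session time zones.
        System.out.println(inSessionZone(stored, "Asia/Shanghai"));       // 2024-01-01T08:00
        System.out.println(inSessionZone(stored, "America/Los_Angeles")); // 2023-12-31T16:00
    }
}
```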

    • TIMESTAMP_LTZ

      public static DataType TIMESTAMP_LTZ(int precision)
      Data type of a timestamp WITH LOCAL time zone. This is a synonym for TIMESTAMP_WITH_LOCAL_TIME_ZONE(int).
    • TIMESTAMP_WITH_LOCAL_TIME_ZONE

      public static DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE()
      Data type of a timestamp WITH LOCAL time zone TIMESTAMP WITH LOCAL TIME ZONE with 6 digits of fractional seconds by default.

      An instance consists of year-month-day hour:minute:second[.fractional] zone with up to microsecond precision and values ranging from 0000-01-01 00:00:00.000000 +14:59 to 9999-12-31 23:59:59.999999 -14:59. Leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

      Compared to ZonedTimestampType, the time zone offset information is not stored physically in every datum. Instead, the type assumes Instant semantics in UTC time zone at the edges of the table ecosystem. Every datum is interpreted in the local time zone configured in the current session for computation and visualization.

      This type fills the gap between time zone free and time zone mandatory timestamp types by allowing the interpretation of UTC timestamps according to the configured session timezone.

    • TIMESTAMP_LTZ

      public static DataType TIMESTAMP_LTZ()
      Data type of a timestamp WITH LOCAL time zone. This is a synonym for TIMESTAMP_WITH_LOCAL_TIME_ZONE().
    • INTERVAL

      public static DataType INTERVAL(DataTypes.Resolution resolution)
      Data type of a temporal interval. There are two types of temporal intervals: day-time intervals with up to nanosecond granularity or year-month intervals with up to month granularity.

      An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from -999999 23:59:59.999999999 to +999999 23:59:59.999999999. The type must be parameterized to one of the following resolutions: interval of days, interval of days to hours, interval of days to minutes, interval of days to seconds, interval of hours, interval of hours to minutes, interval of hours to seconds, interval of minutes, interval of minutes to seconds, or interval of seconds. The value representation is the same for all types of resolutions. For example, an interval of seconds of 70 is always represented in an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000.

      An interval of year-month consists of +years-months with values ranging from -9999-11 to +9999-11. The type must be parameterized to one of the following resolutions: interval of years, interval of years to months, or interval of months. The value representation is the same for all types of resolutions. For example, an interval of months of 50 is always represented in an interval-of-years-to-months format (with default year precision): +04-02.

      Examples: INTERVAL(DAY(2)) for a day-time interval or INTERVAL(YEAR(4)) for a year-month interval.
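
      The two worked values above (70 seconds and 50 months) can be reproduced with a small formatter. The sketch below is an assumption-laden illustration of the described value representation using the default precisions (2 digits for days and years, 6 fractional digits); IntervalFormat and its methods are hypothetical and not part of this class.

```java
import java.time.Duration;
import java.util.Locale;

final class IntervalFormat {

    /** Formats a duration as "+days hours:minutes:seconds.fractional" with microsecond digits. */
    static String formatDayTime(Duration d) {
        char sign = d.isNegative() ? '-' : '+';
        Duration abs = d.abs();
        long days = abs.toDays();
        long hours = abs.toHours() % 24;
        long minutes = abs.toMinutes() % 60;
        long seconds = abs.getSeconds() % 60;
        int micros = abs.getNano() / 1_000;
        return String.format(
                Locale.ROOT, "%c%02d %02d:%02d:%02d.%06d",
                sign, days, hours, minutes, seconds, micros);
    }

    /** Formats a number of months as "+years-months". */
    static String formatYearMonth(int totalMonths) {
        char sign = totalMonths < 0 ? '-' : '+';
        int abs = Math.abs(totalMonths);
        return String.format(Locale.ROOT, "%c%02d-%02d", sign, abs / 12, abs % 12);
    }

    public static void main(String[] args) {
        System.out.println(formatDayTime(Duration.ofSeconds(70))); // +00 00:01:10.000000
        System.out.println(formatYearMonth(50));                   // +04-02
    }
}
```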

    • INTERVAL

      public static DataType INTERVAL(DataTypes.Resolution upperResolution, DataTypes.Resolution lowerResolution)
      Data type of a temporal interval. There are two types of temporal intervals: day-time intervals with up to nanosecond granularity or year-month intervals with up to month granularity.

      An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from -999999 23:59:59.999999999 to +999999 23:59:59.999999999. The type must be parameterized to one of the following resolutions: interval of days, interval of days to hours, interval of days to minutes, interval of days to seconds, interval of hours, interval of hours to minutes, interval of hours to seconds, interval of minutes, interval of minutes to seconds, or interval of seconds. The value representation is the same for all types of resolutions. For example, an interval of seconds of 70 is always represented in an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000.

      An interval of year-month consists of +years-months with values ranging from -9999-11 to +9999-11. The type must be parameterized to one of the following resolutions: interval of years, interval of years to months, or interval of months. The value representation is the same for all types of resolutions. For example, an interval of months of 50 is always represented in an interval-of-years-to-months format (with default year precision): +04-02.

      Examples: INTERVAL(DAY(2), SECOND(9)) for a day-time interval or INTERVAL(YEAR(4), MONTH()) for a year-month interval.

    • ARRAY

      public static DataType ARRAY(DataType elementDataType)
      Data type of an array of elements with the same subtype.

      Compared to the SQL standard, the maximum cardinality of an array cannot be specified but is fixed at Integer.MAX_VALUE. Also, any valid type is supported as a subtype.

    • ARRAY

      public static UnresolvedDataType ARRAY(AbstractDataType<?> elementDataType)
      Unresolved data type of an array of elements with the same subtype.

      Compared to the SQL standard, the maximum cardinality of an array cannot be specified but is fixed at Integer.MAX_VALUE. Also, any valid type is supported as a subtype.

      Note: Compared to ARRAY(DataType), this method produces an UnresolvedDataType. In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • MULTISET

      public static DataType MULTISET(DataType elementDataType)
      Data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity.

      There is no restriction on element types; it is the responsibility of the user to ensure uniqueness.
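
      The phrase "each unique value (including NULL) is mapped to some multiplicity" can be pictured as a map from element to count. The sketch below builds such a map in plain Java as a conceptual illustration; MultisetSketch and toMultiset are hypothetical names, and the use of HashMap is an assumption chosen because it permits a null key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MultisetSketch {

    /** Counts how often each element (including null) occurs in the given list. */
    static <T> Map<T, Integer> toMultiset(List<T> elements) {
        Map<T, Integer> multiplicities = new HashMap<>();
        for (T element : elements) {
            // HashMap accepts a null key, so NULL elements are counted as well.
            multiplicities.merge(element, 1, Integer::sum);
        }
        return multiplicities;
    }

    public static void main(String[] args) {
        Map<String, Integer> bag = toMultiset(Arrays.asList("a", "b", "a", null, null));
        System.out.println(bag.get("a"));  // 2
        System.out.println(bag.get("b"));  // 1
        System.out.println(bag.get(null)); // 2
    }
}
```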

    • MULTISET

      public static UnresolvedDataType MULTISET(AbstractDataType<?> elementDataType)
      Unresolved data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity.

      There is no restriction on element types; it is the responsibility of the user to ensure uniqueness.

      Note: Compared to MULTISET(DataType), this method produces an UnresolvedDataType. In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • MAP

      public static DataType MAP(DataType keyDataType, DataType valueDataType)
      Data type of an associative array that maps keys (including NULL) to values (including NULL). A map cannot contain duplicate keys; each key can map to at most one value.

      There is no restriction on key types; it is the responsibility of the user to ensure uniqueness. The map type is an extension to the SQL standard.

    • MAP

      public static UnresolvedDataType MAP(AbstractDataType<?> keyDataType, AbstractDataType<?> valueDataType)
      Unresolved data type of an associative array that maps keys (including NULL) to values (including NULL). A map cannot contain duplicate keys; each key can map to at most one value.

      There is no restriction on key types; it is the responsibility of the user to ensure uniqueness. The map type is an extension to the SQL standard.

      Note: Compared to MAP(DataType, DataType), this method produces an UnresolvedDataType. In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • ROW

      public static DataType ROW(DataTypes.Field... fields)
      Data type of a sequence of fields. A field consists of a field name, field type, and an optional description. The most specific type of a row of a table is a row type. In this case, each column of the row corresponds to the field of the row type that has the same ordinal position as the column.

      Compared to the SQL standard, an optional field description simplifies the handling of complex structures.

      Use FIELD(String, DataType) or FIELD(String, DataType, String) to construct fields.

    • ROW

      public static DataType ROW(List<DataTypes.Field> fields)
      Data type of a sequence of fields, with the fields passed as a list rather than as varargs.
    • ROW

      public static DataType ROW(DataType... fieldDataTypes)
      Data type of a sequence of fields.

      This is a shortcut for ROW(Field...) where the field names will be generated using f0, f1, f2, ....

    • ROW

      public static DataType ROW()
      Data type of a row type with no fields. It only exists for completeness.
    • ROW

      public static UnresolvedDataType ROW(DataTypes.AbstractField... fields)
      Unresolved data type of a sequence of fields. A field consists of a field name, field type, and an optional description. The most specific type of a row of a table is a row type. In this case, each column of the row corresponds to the field of the row type that has the same ordinal position as the column.

      Compared to the SQL standard, an optional field description simplifies the handling of complex structures.

      Use FIELD(String, AbstractDataType) or FIELD(String, AbstractDataType, String) to construct fields.

      Note: Compared to ROW(Field...), this method produces an UnresolvedDataType with DataTypes.UnresolvedFields. In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • ROW

      public static UnresolvedDataType ROW(AbstractDataType<?>... fieldDataTypes)
      Unresolved data type of a sequence of fields.

      This is a shortcut for ROW(AbstractField...) where the field names will be generated using f0, f1, f2, ....

    • NULL

      public static DataType NULL()
      Data type for representing untyped NULL values. A null type has no other value except NULL, thus, it can be cast to any nullable type similar to JVM semantics.

      This type helps in representing unknown types in API calls that use a NULL literal as well as bridging to formats such as JSON or Avro that define such a type as well.

      The null type is an extension to the SQL standard.

      Note: The runtime does not support this type. It is a pure helper type during translation and planning. Table columns cannot be declared with this type. Functions cannot declare return types of this type.

    • RAW

      public static <T> DataType RAW(Class<T> clazz, org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer)
      Data type of an arbitrary serialized type. This type is a black box within the table ecosystem and is only deserialized at the edges.

      The raw type is an extension to the SQL standard.

      This method assumes that a TypeSerializer instance is present. Use RAW(Class) for automatically generating a serializer.

      Parameters:
      clazz - originating value class
      serializer - type serializer
    • RAW

      public static <T> UnresolvedDataType RAW(Class<T> clazz)
      Unresolved data type of an arbitrary serialized type. This type is a black box within the table ecosystem and is only deserialized at the edges.

      The raw type is an extension to the SQL standard.

      Compared to RAW(Class, TypeSerializer), this method produces an UnresolvedDataType where no serializer is known and a generic serializer should be used. During the resolution, a RAW(Class, TypeSerializer) with Flink's default RAW serializer is created and automatically configured.

      Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • RAW

      public static <T> UnresolvedDataType RAW(org.apache.flink.api.common.typeinfo.TypeInformation<T> typeInformation)
      Unresolved data type of an arbitrary serialized type backed by TypeInformation. This type is a black box within the table ecosystem and is only deserialized at the edges.

      The raw type is an extension to the SQL standard.

      Compared to RAW(Class, TypeSerializer), this method produces an UnresolvedDataType where the serializer will be generated from the enclosed TypeInformation later.

      Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

    • STRUCTURED

      public static <T> DataType STRUCTURED(Class<T> implementationClass, DataTypes.Field... fields)
      Data type of a user-defined object structured type. Structured types contain zero, one or more attributes. Each attribute consists of a name and a type. A type cannot be defined so that one of its attribute types (transitively) uses itself.

      There are two kinds of structured types: types that are stored in a catalog and identified by an ObjectIdentifier, and anonymously defined, unregistered types (usually reflectively extracted) that are identified by an implementation Class.

      This method helps in manually constructing anonymous, unregistered types. This is useful in cases where the reflective extraction using of(Class) is not applicable. However, of(Class) is the recommended way of creating inline structured types as it also considers DataTypeHints.

      Structured types are converted to internal data structures by the runtime. The given implementation class is only used at the edges of the table ecosystem (e.g. when bridging to a function or connector). Serialization and equality (hashCode/equals) are handled by the runtime based on the logical type. An implementation class must offer a default constructor with zero arguments or a full constructor that assigns all attributes.

      Note: A caller of this method must make sure that the DataType.getConversionClass() of the given fields matches the attributes of the given implementation class; otherwise, an exception might be thrown at runtime.

    • SECOND

      public static DataTypes.Resolution SECOND()
      Resolution in seconds with 6 digits for fractional seconds by default.
    • SECOND

      public static DataTypes.Resolution SECOND(int precision)
      Resolution in seconds and (possibly) fractional seconds. The precision is the number of digits of fractional seconds. It must have a value between 0 and 9 (both inclusive). If no precision is specified, it is equal to 6 by default.
    • MINUTE

      public static DataTypes.Resolution MINUTE()
      Resolution in minutes.
    • HOUR

      public static DataTypes.Resolution HOUR()
      Resolution in hours.
    • DAY

      public static DataTypes.Resolution DAY(int precision)
      Resolution in days. The precision is the number of digits of days. It must have a value between 1 and 6 (both inclusive). If no precision is specified, it is equal to 2 by default.
    • DAY

      public static DataTypes.Resolution DAY()
      Resolution in days with 2 digits for the number of days by default.
    • MONTH

      public static DataTypes.Resolution MONTH()
      Resolution in months.
    • YEAR

      public static DataTypes.Resolution YEAR(int precision)
      Resolution in years. The precision is the number of digits of years. It must have a value between 1 and 4 (both inclusive). If no precision is specified, it is equal to 2 by default.
    • YEAR

      public static DataTypes.Resolution YEAR()
      Resolution in years with 2 digits for the number of years by default.
    • FIELD

      public static DataTypes.Field FIELD(String name, DataType dataType)
      Field definition with field name and data type.
    • FIELD

      public static DataTypes.Field FIELD(String name, DataType dataType, String description)
      Field definition with field name, data type, and a description.
    • FIELD

      public static DataTypes.UnresolvedField FIELD(String name, AbstractDataType<?> fieldDataType)
      Unresolved field definition with field name and data type.

      Note: Compared to FIELD(String, DataType), this method produces a DataTypes.UnresolvedField that can contain an UnresolvedDataType.

    • FIELD

      public static DataTypes.UnresolvedField FIELD(String name, AbstractDataType<?> fieldDataType, String description)
      Unresolved field definition with field name, unresolved data type, and a description.

      Note: Compared to FIELD(String, DataType, String), this method produces a DataTypes.UnresolvedField that can contain an UnresolvedDataType.