SQL Reference > Data Types

Data Types

This page covers the type system in Rockset.

Every value stored in Rockset is strongly typed, having one of the types below. A Rockset document itself has type object.

  • int
  • float
  • bool
  • string
  • bytes
  • null
  • array
  • object
  • date
  • datetime
  • time
  • timestamp
  • month_interval
  • microsecond_interval

These are the type names used in type functions and in the result of the DESCRIBE command. More information regarding each type is given below.

Basic Types

Integer

int

Integers are numeric values that do not have fractional components. In Rockset, integers must be in the range betwen -2^63 and 2^63-1, inclusive.

Aliases are integer and int64.

Float

float

Floats are numeric values with fractional components. In Rockset, floats have 64-bit precision.

Aliases are double and double precision.

Boolean

bool

A boolean is either true or false.

Alias is boolean.

String

string

A string represents character data of any length.

In SQL statements, string literals should be enclosed with single quotes (e.g. 'hello'), not double quotes (which are interpreted as identifiers like field or collection names).

Aliases are char, varchar, and text.

Bytes

bytes

Values of type bytes contain arbitrary byte data of any length.

Aliases are binary and varbinary.

Null

null
  • JSON NULL: The type of a field whose value at ingestion was defined and explicitly set to null. The typeof SQL function can be used to filter JSON NULLs. The Rockset type for JSON NULL is called null_type.

  • SQL NULL: The type of a field that is non-existent or undefined at the time of ingestion. Querying a non-existent field also returns null. The predicate IS/IS NOT NULL can be used to filter SQL NULLs. But it does not filter JSON NULLs (null_type).

To emphasize the difference between SQL NULL and JSON NULL let us consider the following example.

Suppose we have a collection with the following documents:

{"name": {"first": "John", "last": "HopCroft"}},
{"name": {"first": "Robert", "last": "Tarjan"}},
{"name": {"first": "Alan", "middle": "Curtis", "last": "Kay"}},
{"name": {"first": "Edsger", "middle": null, "last": "Dijkstra"}}

In that case, the field name.middle is a JSON NULL in the last document as opposed to a SQL NULL in the first 2 documents that it is absent in.

Nested Types

Rockset supports arrays and objects, which can contain any value and hence be nested recursively. The number of levels of a nesting of a field in Rockset is capped at 100. Any given level may have cardinality (keys in object or elements in array) of up to 2^32.

Array

array

An array represents an ordered list of zero or more values, each of any of Rockset type. For example, [], [1, 2, 3], and [false, 42, ['hello', 'world']] are all valid arrays.

Values inside an array foo can be accessed in SQL statements using the square bracket notation shown below.

# foo = [false, 42, ['hello', 'world']]
foo[2]     # has value 2
foo[3][1]  # has value 'hello'

In Rockset, array indexing starts at 1.

Object

object

An object represents a dictionary whose keys are strings and values are of any Rockset type. For example, {}, {'a': 10, 'b': 20}, and {'a': 10, 'b': {'c': true}} are all valid objects.

Values inside an object can be accessed in SQL statements using the dot notation shown below.

# foo = {'a': 10, 'b': {'c': true}}
foo.a      # has value 10
foo.b      # has value {'c': true}
foo.b.c    # has value true
foo."b".c  # also works, field names can be individually escaped with double quotes

Alias is map.

Date and Time Types

Refer also to the page on date and time functions for functions to construct and manipulate date and time values.

When data ingested into or returned from Rockset in JSON form, these types can be specified in a special format shown below.

{
  "_id": "foo"
  "_event_time": {"__rockset_type": "timestamp", "value": "42"}
}

Note that field _event_time is parsed not as object but rather timestamp with value 42.

Date

date

A date value represents a logical calendar date (year, month, day) independent of time zone. A date does not represent a specific time period; it can differ based on timezones. To represent an absolute point in time, use a timestamp instead.

A date literal in SQL syntax is formatted as follows.

DATE 'YYYY-[M]M-[D]D'

This consists of:

  • YYYY: Four-digit year
  • [M]M: One or two digit month
  • [D]D: One or two digit day
DATE '2018-01-01'  # example literal
DATE(2018, 1, 1)   # constructor function

Datetime

datetime

A datetime value represents a point in time (year, month, day, hour, minute, second, microsecond). It does not refer to an absolute instance in time, unlike timestamp. Instead, it is the civil time; the time that a user would see on a watch or calendar.

A date literal in SQL syntax is formatted as follows.

DATETIME 'YYYY-[M]M-[D]D[( )[H]H:[M]M:[S]S[.DDDDDD]]'

This consists of:

  • YYYY: Four-digit year
  • [M]M: One or two digit month
  • [D]D: One or two digit day
  • ( ): A space separator
  • [H]H: One or two digit hour (valid values from 00 to 23)
  • [M]M: One or two digit minutes (valid values from 00 to 59)
  • [S]S: One or two digit seconds (valid values from 00 to 59)
  • [.DDDDDD]: Up to six fractional digits
DATETIME '2018-01-01 9:30:45.456'     # example literal
DATETIME(2018, 1, 1, 9, 30, 45, 456)  # constructor function

Time

time

A time value represents the time of the day (hour, minute, second, millisecond) independent of a specific date.

A time literal in SQL syntax is formatted as follows.

TIME '[H]H:[M]M:[S]S[.DDDDDD]'

This consists of:

  • [H]H: One or two digit hour (valid values from 00 to 23)
  • [M]M: One or two digit minutes (valid values from 00 to 59)
  • [S]S: One or two digit seconds (valid values from 00 to 59)
  • [.DDDDDD: Up to six fractional digits
TIME '09:30:45.456'   # example literal
TIME(9, 30, 45, 456)  # constructor function

Timestamp

timestamp

A timestamp value represents absolute date and time values independent of any time zone.

A timestamp literal in SQL syntax is formatted as follows.

TIMESTAMP 'YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.DDDDDD]][time zone]'

This consists of:

  • YYYY: Four-digit year
  • [M]M: One or two digit month
  • [D]D: One or two digit day
  • ( ): A space separator
  • [H]H: One or two digit hour (valid values from 00 to 23)
  • [M]M: One or two digit minutes (valid values from 00 to 59)
  • [S]S: One or two digit seconds (valid values from 00 to 59)
  • [.DDDDDD]: Up to six fractional digits
  • [time zone]: Offset from Coordinated Universal Time (UTC). When a time zone is not explicitly specified, the default time zone, UTC, is used. The offset is formatted as (+|-)H[H][:M[M]], or simply Z to refer to UTC. When using this format, no space is allowed between the time zone and the rest of the timestamp.
TIMESTAMP '2018-01-01 09:30:45.456-05:00'  # example literal

Month Interval

month_interval

A month interval refers to a specific number of months.

As months have different lengths, month intervals may only be added to or subtracted from to date or datetime values.

Examples of month intervals are shown below.

INTERVAL 3 MONTH
INTERVAL 2 YEAR
INTERVAL '2-3' YEAR TO MONTH

Microsecond Interval

microsecond_interval

A microsecond interval refers to a fixed amount of time with microsecond precision.

Microsecond intervals may be added to or subtracted from dates, times, datetimes, and timestamps. Also, you get a microsecond interval when you subtract two dates, times, datetimes, or timestamps (indicating the length of time between the two time points).

Examples of microsecond intervals are shown below.

INTERVAL 2 DAY
INTERVAL 3 HOUR
INTERVAL 5 MINUTE
INTERVAL 10 SECOND
INTERVAL 10.123 SECOND  # fractional seconds allowed
INTERVAL '2 10:23:45.56' DAY TO SECOND
INTERVAL '10:23' DAY TO MINUTE  # (etc)