SQL Reference > Data Types

Data Types

This page covers the type system in Rockset.

Every value stored in Rockset is strongly typed, having one of the types below. A Rockset document itself has type object.

  • int
  • float
  • bool
  • string
  • bytes
  • null
  • array
  • object
  • date
  • datetime
  • time
  • timestamp
  • month_interval
  • microsecond_interval
  • geography

These are the type names used in type functions and in the result of the DESCRIBE command. More information regarding each type is given below.

Basic Types

Integer

int

Integers are numeric values that do not have fractional components. In Rockset, integers must be in the range betwen -2^63 and 2^63-1, inclusive.

Aliases are integer and int64.

Float

float

Floats are numeric values with fractional components. In Rockset, floats have 64-bit precision.

Aliases are double and double precision.

Boolean

bool

A boolean is either true or false.

Alias is boolean.

String

string

A string represents character data of any length.

In SQL statements, string literals should be enclosed with single quotes (e.g. 'hello'), not double quotes (which are interpreted as identifiers like field or collection names).

Aliases are char, varchar, and text.

Bytes

bytes

Values of type bytes contain arbitrary byte data of any length.

Aliases are binary and varbinary.

Null

null

We distinguish between two different forms of nulls: null and undefined. undefined is returned when selecting a field that doesn’t exist in an object, or when subscripting an array with an out-of-range index. null is returned when selecting a field whose value has been explicitly set to null.

null and undefined behave exactly the same in almost all cases. There are a couple of exceptions:

  • Functions such as typeof and json_format.
  • A predicate IS UNDEFINED returns true for undefined and false for null. IS NULL returns true for both.
  • A distinction between undefined and null is preserved in the result set and in the JSON serialization.
  • Objects where fields have a null value are not equal to objects with a missing fields, i.e. {"a": 1, "b": null} != {"a": 1}.

Most functions will return null when one of the arguments is null or undefined. Most aggregate functions will ignore both null and undefined inputs.

To emphasize the difference between null and undefined let us consider the following example.

Suppose we have a collection with the following documents:

{"name": {"first": "John", "last": "HopCroft"}},
{"name": {"first": "Robert", "last": "Tarjan"}},
{"name": {"first": "Alan", "middle": "Curtis", "last": "Kay"}},
{"name": {"first": "Edsger", "middle": null, "last": "Dijkstra"}}

In that case, the field name.middle is a null in the last document as opposed to undefined in the first 2 documents that it is absent in.

Nested Types

Rockset supports arrays and objects, which can contain any value and hence be nested recursively. The number of levels of a nesting of a field in Rockset is capped at 100. Any given level may have cardinality (keys in object or elements in array) of up to 2^32.

Array

array

An array represents an ordered list of zero or more values, each of any of Rockset type. For example, [], [1, 2, 3], and [false, 42, ['hello', 'world']] are all valid arrays.

Values inside an array foo can be accessed in SQL statements using the square bracket notation shown below.

# foo = [false, 42, ['hello', 'world']]
foo[2]     # has value 42
foo[3][1]  # has value 'hello'

In Rockset, array indexing starts at 1.

Object

object

An object represents a dictionary whose keys are strings and values are of any Rockset type. For example, {}, {'a': 10, 'b': 20}, and {'a': 10, 'b': {'c': true}} are all valid objects.

Values inside an object can be accessed in SQL statements using the dot notation shown below.

# foo = {'a': 10, 'b': {'c': true}}
foo.a      # has value 10
foo.b      # has value {'c': true}
foo.b.c    # has value true
foo."b".c  # also works, field names can be individually escaped with double quotes

Alias is map.

Date and Time Types

Refer also to the page on date and time functions for functions to construct and manipulate date and time values.

When data ingested into or returned from Rockset in JSON form, these types can be specified in a special format shown below.

{
  "_id": "foo"
  "_event_time": {"__rockset_type": "timestamp", "value": "42"}
}

Note that field _event_time is parsed not as object but rather timestamp with value 42.

Date

date

A date value represents a logical calendar date (year, month, day) independent of time zone. A date does not represent a specific time period; it can differ based on timezones. To represent an absolute point in time, use a timestamp instead.

A date literal in SQL syntax is formatted as follows.

DATE 'YYYY-[M]M-[D]D'

This consists of:

  • YYYY: Four-digit year
  • [M]M: One or two digit month
  • [D]D: One or two digit day
DATE '2018-01-01'  # example literal
DATE(2018, 1, 1)   # constructor function

Datetime

datetime

A datetime value represents a point in time (year, month, day, hour, minute, second, microsecond). It does not refer to an absolute instance in time, unlike timestamp. Instead, it is the civil time; the time that a user would see on a watch or calendar.

A date literal in SQL syntax is formatted as follows.

DATETIME 'YYYY-[M]M-[D]D[( )[H]H:[M]M:[S]S[.DDDDDD]]'

This consists of:

  • YYYY: Four-digit year
  • [M]M: One or two digit month
  • [D]D: One or two digit day
  • ( ): A space separator
  • [H]H: One or two digit hour (valid values from 00 to 23)
  • [M]M: One or two digit minutes (valid values from 00 to 59)
  • [S]S: One or two digit seconds (valid values from 00 to 59)
  • [.DDDDDD]: Up to six fractional digits
DATETIME '2018-01-01 9:30:45.456'     # example literal
DATETIME(2018, 1, 1, 9, 30, 45, 456)  # constructor function

Time

time

A time value represents the time of the day (hour, minute, second, millisecond) independent of a specific date.

A time literal in SQL syntax is formatted as follows.

TIME '[H]H:[M]M:[S]S[.DDDDDD]'

This consists of:

  • [H]H: One or two digit hour (valid values from 00 to 23)
  • [M]M: One or two digit minutes (valid values from 00 to 59)
  • [S]S: One or two digit seconds (valid values from 00 to 59)
  • [.DDDDDD: Up to six fractional digits
TIME '09:30:45.456'   # example literal
TIME(9, 30, 45, 456)  # constructor function

Timestamp

timestamp

A timestamp value represents absolute date and time values independent of any time zone.

A timestamp literal in SQL syntax is formatted as follows.

TIMESTAMP 'YYYY-[M]M-[D]D[( |T)[H]H:[M]M[:[S]S[.DDDDDD]]][time zone]'

This consists of:

  • YYYY: Four-digit year
  • [M]M: One or two digit month
  • [D]D: One or two digit day
  • ( ): A space separator
  • [H]H: One or two digit hour (valid values from 00 to 23)
  • [M]M: One or two digit minutes (valid values from 00 to 59)
  • [S]S: One or two digit seconds (valid values from 00 to 59)
  • [.DDDDDD]: Up to six fractional digits
  • [time zone]: Offset from Coordinated Universal Time (UTC). When a time zone is not explicitly specified, the default time zone, UTC, is used. The offset is formatted as (+|-)H[H][:M[M]], Z (synonym for UTC), or UTC. When using this format, there is a space between the time zone and the rest of the timestamp only for the UTC suffix.
TIMESTAMP '2018-01-01 09:30:45.456-05:00'  # example literal

Month Interval

month_interval

A month interval refers to a specific number of months.

As months have different lengths, month intervals may only be added to or subtracted from date or datetime values.

Examples of month intervals are shown below.

INTERVAL 3 MONTH
INTERVAL 2 YEAR
INTERVAL '2-3' YEAR TO MONTH

Microsecond Interval

microsecond_interval

A microsecond interval refers to a fixed amount of time with microsecond precision.

Microsecond intervals may be added to or subtracted from dates, times, datetimes, and timestamps. Also, you get a microsecond interval when you subtract two dates, times, datetimes, or timestamps (indicating the length of time between the two time points).

Examples of microsecond intervals are shown below.

INTERVAL 2 DAY
INTERVAL 3 HOUR
INTERVAL 5 MINUTE
INTERVAL 10 SECOND
INTERVAL 10.123 SECOND  # fractional seconds allowed
INTERVAL '2 10:23:45.56' DAY TO SECOND
INTERVAL '10:23' DAY TO MINUTE  # (etc)

Geography

geography

A geography refers to a shape on the surface of a sphere (Earth). It can be either a point (latitude and longitude), a linestring (an ordered list of connected points), or a polygon (a list of points which enclose some interior space).

All lines connecting points are considered to be geodesics - the shortest line between the two points on the sphere. For more details, see the section on spherical geodesic edges in the S2 Geometry documetation.

This is how you would create a point at 34N 12E:

SELECT
    ST_GEOGPOINT(12, 34)
{"__rockset_type":"GEOGRAPHY","value":{"type":"Point","coordinates":[34.00000000000001,12]}}

Here’s another way to construct the same point from a string in well known text format.

SELECT
    ST_GEOGFROMTEXT('POINT(12 34)')
{"__rockset_type":"GEOGRAPHY","value":{"type":"Point","coordinates":[34.00000000000001,12]}}

This example creates a linestring from 34N 12E to 78N 56E to 42N 42E:

SELECT
    ST_GEOGFROMTEXT('LINESTRING(12 34, 56 78, 42 42)')
{"__rockset_type":"GEOGRAPHY","value":{"type":"LineString","coordinates":[[34.00000000000001,12],[78,56],[42,42]]}}

Here is an example of a roughly square polygon around the west coast of Africa:

SELECT
    ST_GEOGFROMTEXT('POLYGON((0 0, 10 0, 10 10, 0 10))')
{"__rockset_type":"GEOGRAPHY","value":{"type":"Polygon","coordinates":[[[0,0],[0,10],[10.000000000000004,10],[10,0]]]}}

You can import geography types into Rockset by applying either ST_GEOGPOINT or ST_GEOGFROMTEXT in a field mapping. Refer also to the page on geographic functions for functions to construct and manipulate geographic values.