SQL Reference > Commands

Commands

This page documents all available SQL commands in Rockset.

In command templates shown here, all uppercase words are keywords and lowercase words are user-provided values. In practice, all keywords and function names are case insensitive; for example, describe "_events" is equivalent to DESCRIBE "_events".

DESCRIBE

DESCRIBE collection_name

Returns a list of all field paths that appear in a collection each with its corresponding data type. A * in a field path indicates an array, and can be replaced by an integer index to construct an instance of the field path. If multiple data types appear for a certain field path, DESCRIBE will return one entry for each type.

DESCRIBE _events
+-------------------------------------+--------+
| field                               | type   |
|-------------------------------------+--------|
| ['apiKeyName']                      | string |
| ['collections', '*']                | string |
| ['details', 'key']                  | string |
| ['details', 'last_queried_time_ms'] | int    |
| ['details', 'last_updated_time_ms'] | int    |
| ['details', 'limit_bytes']          | string |
| ['details', 'query_sql']            | string |
| ['details', 'query_time_ms']        | int    |
| ['details', 's3']                   | string |
| ['details', 'used_bytes']           | string |
| ['eventTime']                       | int    |
| ['integrations', '*']               | string |
| ['kind']                            | string |
| ['label']                           | string |
| ['message']                         | string |
| ['type']                            | string |
| ['userEmail']                       | string |
+-------------------------------------+--------+

SELECT

[ WITH with_query [, ...] ]
SELECT [ ALL | DISTINCT ] select_expr [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
[ HAVING condition]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_query ]
[ ORDER BY expression [ ASC | DESC ] [, ...] ]
[ LIMIT [ count | ALL ] ]

Returns result set from one or more collections. May include various other clauses:

  • with_query is of the form alias_name AS select_query.
  • select_expr is a valid SQL expression.
  • from_item is one of:
    • collection_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    • from_item join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ] where join_type is one of:
      • [ INNER ] JOIN
      • LEFT [ OUTER ] JOIN
      • RIGHT [ OUTER ] JOIN
      • CROSS JOIN (which should not be followed by ON/USING, all other should)
  • condition is a SQL expression that evaluates to a boolean. See the section on conditional operators.
  • grouping_element is one of:
    • ()
    • expression
    • GROUPING SETS ( ( column [, ...] ) [, ...] )
    • CUBE ( column [, ...] )
    • ROLLUP ( column [, ...] )
  • select_query is itself a SELECT command with the same recursive format.
  • expression is a valid SQL expression.
  • count is the number of results to be returned.

Below we cover common uses some clauses in more detail.

To return all available fields, instead of listing them out as select expressions, specify a single select expression *.

SELECT * FROM _events
+--------------------------------------+---------------+--------+---------------+------+
| _id                                  | eventTime     | kind   | label         | type |
+--------------------------------------+---------------+--------+---------------+------+
| cf0e193d-9ee0-4c6a-9f2b-cab9893e97e3 | 1536164080248 | QUERY  | QUERY_SUCCESS | INFO |
| 4a5e941e-7683-4db2-b8e0-ee0e4807d6a7 | 1536164079995 | QUERY  | QUERY_SUCCESS | INFO |
| 6a9c5d49-ee3b-4b78-b653-006c540f70c3 | 1536162113586 | QUERY  | QUERY_ERROR   | INFO |
+--------------------------------------+---------------+--------+---------------+------+

To return only unique results, include DISTINCT after SELECT.

SELECT DISTINCT label FROM _events
+---------------+
| label         |
+---------------+
| QUERY_SUCCESS |
| QUERY_ERROR   |
+---------------+

WITH Clause

The WITH clause can be used to chain queries by defining a named subquery that can be then used within the main query. Using WITH clauses can improve the readablility and maintainability of complex queries.

WITH x AS (SELECT _id, label FROM _events)
SELECT _id, label FROM x
+--------------------------------------+---------------+
| _id                                  | label         |
|--------------------------------------+---------------|
| 23658547-1ae5-4be8-97e5-6e043f682c31 | QUERY_SUCCESS |
| 85e16122-d749-444d-ab63-bb5a0544368b | QUERY_SUCCESS |
| 972485f2-e3b0-43c4-bb7e-c70b7df3e86c | QUERY_SUCCESS |
+--------------------------------------+---------------+

There can be multiple subqueries in a WITH clause, and one can use any of the previous ones.

WITH
    x AS (SELECT _id, label, kind FROM _events),
    y AS (SELECT _id, kind AS c FROM x)
SELECT _id, c FROM y
+--------------------------------------+-------+
| _id                                  | c     |
|--------------------------------------+-------|
| 589eaee9-0c62-46f9-8f7a-0a5a3fcf2e2c | QUERY |
| 3e2e99b9-896f-4852-9669-ab5d1a1a6ebf | QUERY |
| a2b5d818-cad6-49b8-8f9d-bcc77146e060 | QUERY |
+--------------------------------------+-------+

FROM Clause

When querying a single collection, the FROM clause consists simply of the collection name. However, the FROM clause can also be used to reshape the data that is being queried.

UNNEST

UNNEST is a function that can be used to expand arrays of values or documents to be queried.

Suppose collection companies contains documents of the form:

{
    "make": "Ford",
    "models": ["Focus", "Mustang", "F-150"],
    "offices": [
        {"city": "Dearborn", "state": "MI"},
        {"city": "Denver", "state": "CO"}
    ]
},
{
    "make": "Toyota",
    "models": ["Prius", "Highlander", "Camry"],
    "offices": [
        {"city": "Jacksonville", "state": "FL"},
        {"city": "Ann Arbor", "state": "MI"}
    ]
}

All the values in all the models arrays can be extracted and dumped as individual rows available under an alias, as shown in the example below.

SELECT 
    models.models
FROM 
    companies,
    UNNEST(companies.models AS models) AS models
+------------+
| models     |
|------------|
| Prius      |
| Highlander |
| Camry      |
| Focus      |
| Mustang    |
| F-150      |
+------------+

The index within the original array can also be extracted by appending WITH ORDINALITY AS alias_name inside the UNNEST function.

SELECT 
    models.value, models.index
FROM
    companies,
    UNNEST(companies.models AS value WITH ORDINALITY AS index) AS models
+---------+------------+
| index   | value      |
|---------+------------|
| 1       | Prius      |
| 2       | Highlander |
| 3       | Camry      |
| 1       | Focus      |
| 2       | Mustang    |
| 3       | F-150      |
+---------+------------+

When the unnested array contains objects, its fields become available for normal querying.

SELECT 
    offices.state,
    ARRAY_AGG(offices.city) AS cities
FROM 
    companies,
    UNNEST(companies.offices AS value) AS offices
GROUP BY offices.state
+---------------------------+---------+
| cities                    | state   |
|---------------------------+---------|
| ['Denver']                | CO      |
| ['Jacksonville']          | FL      |
| ['Dearborn', 'Ann Arbor'] | MI      |
+---------------------------+---------+

JOIN

Rockset supports standard SQL joins.

  • An inner join returns only results where there is a match on both sides.
  • A left (or right) outer join returns all results from the left (or right) side, populating the other side’s columns when possible and with null otherwise.
  • A cross join returns the Cartesian product of the two sides’ results (i.e all combinations of documents in two collections).

Cross joins can either be specified using the explicit CROSS JOIN syntax or by specifying multiple collections in the FROM clause. The queries below are equivalent.

SELECT collection1.id, collection2.id
FROM collection1, collection2
SELECT collection1.id, collection2.id
FROM collection1
CROSS JOIN collection2

GROUP BY Clause

Often, interesting results are obtained by grouping records together on some similar attributes. GROUP BY is the mechanism by which the result of a SELECT is partitioned based on a particular field. Non-aggregated fields can only be included in aggregate functions in select expressions with the presense of a GROUP BY clause.

A simple GROUP BY clause may contain any expression composed of input columns.

SELECT count(*) AS c, label FROM _events GROUP BY label
+--------+------------------------------+
| c      | label                        |
|--------+------------------------------|
| 41     | API_KEY_DELETED              |
| 35     | API_KEY_CREATED              |
| 88256  | QUERY_SUCCESS                |
| 133    | QUERY_INVALID                |
| 24     | API_KEY_ERROR                |
| 72     | INGEST_WARNING               |
| 16     | COLLECTION_DROPPED           |
| 121988 | INGEST_INFO                  |
+--------+------------------------------+

HAVING is used to filter results, based on the output of aggregate functions.

SELECT count(*) AS c, label FROM _events 
GROUP BY label 
HAVING count(*) < 50
+--------+------------------------------+
| c      | label                        |
|--------+------------------------------|
| 41     | API_KEY_DELETED              |
| 35     | API_KEY_CREATED              |
| 24     | API_KEY_ERROR                |
| 16     | COLLECTION_DROPPED           |
+--------+------------------------------+

For more complex aggregations, Rockset supports GROUPING SETS, CUBE and ROLLUP. These can be used to perform aggregations on multiple fields in a single query.

GROUPING SETS

Grouping sets allow users to specify multiple lists of fields to group on. The fields that are not part of a given sublist of grouping fields are set to null. It can logically be thought of as the UNION of a single SELECT query with a simple GROUP BY for each GROUPING SET.

SELECT label, kind FROM _events
GROUP BY GROUPING SETS (
    (label),
    (label, kind)
)
+--------------+------------------------------+
| kind         | label                        |
|--------------+------------------------------|
| null         | API_KEY_DELETED              |
| API_KEY      | API_KEY_DELETED              |
| COLLECTION   | COLLECTION_CREATED           |
| null         | COLLECTION_CREATED           |
| QUERY        | QUERY_SUCCESS                |
| QUERY        | QUERY_COLLECTION_NOT_READY   |
| INGEST       | INGEST_INFO                  |
| INGEST       | INGEST_INITIALIZED           |
+--------------+------------------------------+

CUBE

CUBE is a higher level operator that can be used to generate GROUPING SETS. It generates all possible GROUPING SETS given a set of fields.

SELECT label, kind FROM _events
GROUP BY CUBE (label, kind)

This is equivalent to:

SELECT label, kind FROM _events
GROUP BY GROUPING SETS (
    (label, kind),
    (label),
    (kind),
    ()
)

ROLLUP

The ROLLUP operator is similar to CUBE in that it’s a higher order abstraction over GROUPING SETS. It generates all possible starting subsequences for a given set of fields.

SELECT label, kind FROM _events
GROUP BY ROLLUP (label, kind)

This is equivalent to:

SELECT label, kind FROM _events
GROUP BY GROUPING SETS (
    (label, kind),
    (label),
    ()
)

ORDER BY Clause

The ORDER BY clause is used to sort a result set by one or more output expressions. By default ordering for an expression is ascending. Use DESC to get result set in descending order.

SELECT * FROM users
ORDER BY user.username DESC

Use multiple expressions, each with a different ordering, by separating with commas.

SELECT * FROM users
ORDER BY users.first_name ASC, users.last_name DESC

LIMIT Clause

The LIMIT clause restricts the number of rows in the result. The following query limits the result set to contain a maximum of 10 rows. The result set represents the top 10 customers based on customer rating.

SELECT * FROM customers
LIMIT 10
ORDER BY customers.customer_rating DESC

Combining Result Sets

UNION returns the combination of results of two SELECT queries.

SELECT id, name FROM collection1
UNION
SELECT userId, name FROM collection2

INTERSECTION returns only distinct common results of two SELECT queries.

SELECT id, name FROM collection1
INTERSECTION
SELECT userId, name FROM collection2

EXCEPT returns only distinct results of the first query that are not in the second query.

SELECT id, name FROM collection1
EXCEPT
SELECT userId, name FROM collection2