Key Concepts

This section highlights several key concepts in Rockset.

Collections

A Collection is a set of Rockset documents, similar to a table in a traditional SQL database. Collections can be queried using SQL, either directly or through Query Lambdas (described below).

Virtual Instances

A Virtual Instance is a set of compute resources used to process streaming ingest and queries.

Every Rockset account starts with one Virtual Instance (VI) per active region. Unless otherwise specified, all streaming ingest and queries are routed to this VI. To provide isolation between ingest and queries or isolation across different query workloads, multiple Virtual Instances can be created to leverage Rockset's compute-compute separated architecture.

Data Sources

Rockset supports ingesting data from many types of data sources:

  • Data Streams (Kafka, Kinesis)
  • OLTP Databases (DynamoDB, MongoDB, MySQL, PostgreSQL)
  • Data Lakes (S3, GCS)

As new data arrives in your data sources, Rockset indexes it within seconds.

Integrations

Integrations associate authentication credentials with your data sources and provide finer-grained control over data ingestion. A given integration can be used with multiple collections.

Rockset currently supports fully managed integrations for a variety of data sources. With a fully managed integration, changes to your data sources are automatically detected and replicated into Rockset in real time.

Ingest Transformations

Ingest Transformations allow you to transform the raw input data coming from your data sources before it is loaded into your Rockset collections. Rockset's ingestion platform applies these transformations both during the initial load of a new collection's data and on an ongoing basis to new documents coming from your data source, giving you a real-time materialized view of your data.
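To make the "initial load plus ongoing" behavior concrete, here is a minimal Python sketch. It models the transformation as a plain Python function rather than in Rockset's own transformation syntax, and the field names, the `transform` and `ingest` helpers, and the in-memory `collection` list are all hypothetical stand-ins for illustration.

```python
def transform(raw):
    """Hypothetical transformation: keep two fields and normalize the country code."""
    return {
        "user_id": raw["user_id"],
        "country": raw["country"].upper(),
    }


# In-memory stand-in for a collection (not how Rockset stores data).
collection = []


def ingest(documents):
    """Apply the transformation to each incoming document before storing it."""
    for doc in documents:
        collection.append(transform(doc))


# Initial load of data that already exists in the source.
ingest([
    {"user_id": 1, "country": "us", "email": "a@example.com"},
    {"user_id": 2, "country": "de", "email": "b@example.com"},
])

# Ongoing ingestion: a later document passes through the same transformation.
ingest([{"user_id": 3, "country": "fr", "email": "c@example.com"}])
```

Because every document passes through the same function, the stored documents never contain the raw `email` field, regardless of whether they arrived in the initial load or afterwards.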

Query Lambdas

Query Lambdas are named, parameterized SQL queries stored in Rockset that can be executed from a dedicated REST endpoint. Using Query Lambdas, you can save your SQL queries as separate resources in Rockset and manage them as they move from development to production.
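As a rough sketch of what calling such an endpoint could look like, the Python below builds (but does not send) an HTTP request that executes a Query Lambda. The endpoint path, payload shape, and `ApiKey` authorization scheme are assumptions based on Rockset's REST API and should be checked against the API reference. The API server hostname, workspace, Query Lambda name, tag, and parameter are placeholders.

```python
import json
import urllib.request

# Assumed value for illustration only: substitute your own region's API server.
API_SERVER = "https://api.usw2a1.rockset.com"


def build_query_lambda_request(api_key, workspace, name, tag, parameters):
    """Build (without sending) a POST request that executes a Query Lambda.

    `parameters` maps each parameter name to a (type, value) pair.
    """
    # Assumed endpoint layout; verify against the REST API reference.
    url = f"{API_SERVER}/v1/orgs/self/ws/{workspace}/lambdas/{name}/tags/{tag}"
    body = {
        "parameters": [
            {"name": p, "type": t, "value": v}
            for p, (t, v) in parameters.items()
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_query_lambda_request(
    api_key="YOUR_API_KEY",
    workspace="commons",
    name="top_customers",  # hypothetical Query Lambda name
    tag="latest",          # assumed tag name
    parameters={"min_orders": ("int", "10")},
)
# To execute it, pass `request` to urllib.request.urlopen and parse the JSON response.
```

Keeping the SQL text on the server means client code like this only supplies a name and parameter values, so the query can be revised without redeploying the application.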

Aliases

Aliases are used to associate multiple names with Rockset collections. You can use the alias name in your queries in place of the actual collection name. Additionally, you can configure an alias to point to a different collection at any time without any downtime for your queries.
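The repointing behavior can be pictured with a toy in-memory model in Python. This is not Rockset's implementation or API; the collection names, the `aliases` mapping, and the `resolve` and `count_documents` helpers are hypothetical and exist only to show that the name used by the query stays fixed while its target changes.

```python
# Toy stand-ins for two collections and one alias (hypothetical names).
collections = {
    "events_v1": [{"id": 1}],
    "events_v2": [{"id": 1}, {"id": 2}],
}
aliases = {"events": "events_v1"}


def resolve(name):
    """Return the collection a name refers to, following an alias if one exists."""
    return aliases.get(name, name)


def count_documents(name):
    """Stand-in for a query such as SELECT COUNT(*) FROM <name>."""
    return len(collections[resolve(name)])


count_before = count_documents("events")  # served by events_v1
aliases["events"] = "events_v2"           # repoint the alias; the query is unchanged
count_after = count_documents("events")   # served by events_v2
```

The caller uses the name `events` both times; only the mapping behind it changes, which is what lets a rebuilt collection be swapped in without editing queries.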

Views

Views are virtual collections defined by SQL queries. A view’s SQL query can reference collections, aliases, or other views, or it can reference nothing at all. For example, SELECT 1 does not reference anything, but is still a valid query for a view. Views help modularize SQL: a commonly used query can be defined once as a view and then reused in other queries.

A view does not store any data: whenever a view is queried, the defining query is executed.
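The sketch below illustrates this "stored query, not stored data" behavior. It uses Python's built-in SQLite module purely as a local stand-in, since the general semantics of a SQL view are the same; it does not call Rockset, and the `orders` table and `large_orders` view are hypothetical.

```python
import sqlite3

# SQLite is only a local stand-in for illustrating view semantics.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.execute("INSERT INTO orders VALUES ('alice', 20.0), ('bob', 5.0)")

# The view records its defining query, not the rows that query returns.
conn.execute(
    "CREATE VIEW large_orders AS "
    "SELECT customer, amount FROM orders WHERE amount >= 10"
)

before = conn.execute(
    "SELECT customer FROM large_orders ORDER BY customer"
).fetchall()

# Insert new underlying data, then query the view again.
conn.execute("INSERT INTO orders VALUES ('carol', 42.0)")
after = conn.execute(
    "SELECT customer FROM large_orders ORDER BY customer"
).fetchall()

print(before)  # [('alice',)]
print(after)   # [('alice',), ('carol',)]
```

The second result includes the new row even though the view was never refreshed, because the defining query runs again each time the view is queried.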

Workspaces

Workspaces are containers that hold Rockset resources (e.g. collections, query lambdas, views, and aliases) as well as other workspaces.

Conceptually, workspaces are analogous to folders in a filesystem, while the Rockset resources they contain are analogous to files. Every Rockset resource must be part of exactly one workspace.