Concepts

Rockset uses a new type of relational document model, combined with Converged Indexing, distributed query processing and disaggregated cloud-native infrastructure.

Rockset Architecture

Serverless

Rockset is delivered as a managed service in the cloud. It abstracts away the infrastructure and operational considerations of a data management platform. It does not require you to provision machines or instances in the cloud in advance and can use cloud elasticity to handle scaling automatically for you.

Documents

A document is the basic data primitive in Rockset. In Rockset a document is composed of key-value pairs.

{
   key1: value1,
   key2: {
      nestedKey1: nestedValue1,
      nestedKey2: [arrayValue1, arrayValue2]
   },
   ...
   keyN: valueN
}

The values themselves may be complex documents consisting of other key-value pairs, arrays, or primitive values including INT, STRING, TIMESTAMP, etc. Refer to our datatypes glossary for a list of supported data types.

A Rockset document is uniquely identified by a document ID, denoted as _id. If an _id is not specified, one is automatically assigned to the document at the time of insertion.
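The _id rule above can be sketched in a few lines of plain Python. This is only an illustration of the behavior described, not Rockset's implementation: the helper name with_document_id is hypothetical, and the use of a random UUID as the generated ID is an assumption made for the example.

```python
import uuid


def with_document_id(doc):
    """Return a copy of doc that is guaranteed to carry an _id field."""
    result = dict(doc)
    if "_id" not in result:
        # Assumption for illustration: a random UUID stands in for
        # whatever ID the service would actually assign on insertion.
        result["_id"] = str(uuid.uuid4())
    return result


# A caller-supplied _id is kept as-is.
explicit = with_document_id({"_id": "user-1", "name": "Ada"})

# A document without an _id gets one assigned.
original = {"name": "Grace", "tags": ["a", "b"]}
generated = with_document_id(original)
```

The helper copies the input rather than mutating it, so the caller's dictionary is left untouched.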

Smart Schema

Field types are auto-inferred by Rockset at the time of insert and exposed as a Smart Schema to enable relational SQL queries, without upfront data modeling or schema design. By associating field types with every occurrence of a field value, Rockset is able to support both dynamic and strong typing. This means writes are never rejected, so there is no data loss even if there are mixed types associated with a particular field.
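To make the idea of per-value typing concrete, here is a minimal sketch that records the type of every occurrence of a field across a set of documents. It is a toy model under stated assumptions, not Rockset's Smart Schema: the function names and the lowercase type labels are hypothetical, and nested fields are addressed with dotted paths purely for illustration.

```python
from collections import Counter, defaultdict


def type_name(value):
    """Map a Python value to an illustrative type label."""
    if value is None:
        return "null"
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def infer_schema(documents):
    """Count the observed types for each field path across all documents."""
    schema = defaultdict(Counter)

    def visit(path, value):
        schema[path][type_name(value)] += 1
        if isinstance(value, dict):
            for key, nested in value.items():
                visit(f"{path}.{key}", nested)

    for doc in documents:
        for key, value in doc.items():
            visit(key, value)
    return {path: dict(counts) for path, counts in schema.items()}


# The zip field holds an int in one document and a string in another.
# Neither write is rejected; both types are recorded for the field.
docs = [
    {"zip": 94107, "name": "Ada"},
    {"zip": "94107-1234", "name": "Grace", "address": {"city": "SF"}},
]
schema = infer_schema(docs)
```

Because the type is attached to each value rather than fixed per field, a field such as zip can carry mixed types without any document being dropped.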

Rockset provides atomic writes at the document level: you can update multiple fields within a single document atomically. Rockset does not support atomic updates across documents.

Collections

A collection is a set of Rockset documents. All documents within a collection and all fields within a document are mutable. You can create a collection, add documents to it from various sources, and use our client APIs (REST, Python, or Java) to run queries against it. SQL queries can also JOIN documents across different Rockset collections.

Learn more about creating Rockset collections by exploring the Data Sources section.

Workspaces

A workspace is a container that can hold collections and other workspaces. In this way, workspaces are analogous to folders whereas collections are analogous to files. Every collection must be part of exactly one workspace.

Nested workspaces are accessed by separating their names with dots (“.”), e.g. SELECT first_name, email FROM marketing.leads.web_signups will access collection web_signups in workspace marketing.leads.

If no workspace is specified, it is implicitly assumed to be commons, so DESCRIBE users is equivalent to DESCRIBE commons.users.
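The naming rules above (dot-separated nested workspaces, with commons as the default) can be sketched as a small name resolver. This is an illustrative model only: resolve_collection and DEFAULT_WORKSPACE are hypothetical names, and the rejection of empty name segments is an assumption of this sketch rather than documented behavior.

```python
DEFAULT_WORKSPACE = "commons"


def resolve_collection(name):
    """Split a dotted name into a (workspace, collection) pair."""
    parts = name.split(".")
    # Assumption for this sketch: empty segments such as "a..b" are invalid.
    if any(not part for part in parts):
        raise ValueError(f"invalid collection name: {name!r}")
    # The last segment is the collection; everything before it is the workspace.
    *workspace_parts, collection = parts
    if workspace_parts:
        workspace = ".".join(workspace_parts)
    else:
        workspace = DEFAULT_WORKSPACE
    return workspace, collection


nested = resolve_collection("marketing.leads.web_signups")
implicit = resolve_collection("users")
qualified = resolve_collection("commons.users")
```

Here implicit and qualified resolve to the same pair, mirroring the equivalence of DESCRIBE users and DESCRIBE commons.users.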

Integrations

Data in an external data management system can be easily and quickly ingested into a collection. Integrations are a way for you to manage credentials and access mechanisms to external data sources in a secure fashion.

Learn more about creating Rockset integrations by exploring the Data Integrations section.

Architecture

Learn how Rockset’s architecture enables highly parallelized execution of complex queries across diverse data sets.
