Multi-Tenant Design Guide

Rockset is commonly used for multi-tenant applications where there is a need to securely and efficiently serve multiple tenants or customers simultaneously.

This guide addresses several challenges that arise when designing for multi-tenancy:

  • Resource allocation and management: Allocating and managing resources across different tenants, especially when tenants have varying demands and usage patterns that may impact other tenants. Rockset solves this by supporting multiple Virtual Instances, with different allocations of compute and memory, for different tenants or groups of tenants.
  • Data security and isolation: Ensuring isolation is crucial to prevent tenants from accessing or affecting the resources and data of another tenant. Rockset ensures data security and isolation with role-based access controls and API parameters.
  • Performance variability: Due to the shared nature of resources, performance can be inconsistent. If one tenant's application experiences a sudden spike in usage, it could impact the performance of other tenants' applications. Rockset solves this with multiple, isolated virtual instances that can be used for individual tenants or groups of tenants. Rockset can also scale up and down on demand, dynamically allocating resources to handle spikes or dips in tenant usage.
  • Tenant onboarding and offboarding: Adding and removing tenants from a cluster requires careful management to ensure that resources are appropriately allocated and de-allocated. A major challenge in other systems is that storage and compute are tightly coupled, making it difficult to add or remove tenants without impacting the other tenants in the cluster. Rockset separates and independently scales hot storage and compute to eliminate the data movement challenges associated with tenant onboarding and offboarding. Rockset is also a mutable database, which means that deleting all the data for a tenant who needs to be offboarded is as easy as issuing an update query.
  • Tenant access: Different tenants may want different levels of access to the data. Rockset solves this by creating views for querying across multiple tenants’ data while restricting data access by role or function.
  • Tenant billing: Another challenge in a multi-tenant system is accurately and easily attributing database serving cost to a tenant or a group of tenants. The Rockset usage bill has an individual line item per Virtual Instance, so assigning a tenant or a group of tenants to its own Virtual Instance automatically shows the cost of serving that tenant or group.

This guide covers the following designs for multi-tenant applications:

Designing Virtual Instances For Multi-Tenant Compute Isolation

You can create multiple Virtual Instances, each with isolated compute and memory resources, to give tenants predictable query performance at scale.

For example, a SaaS provider could offer standard, premium, and dedicated tiers, each with a different performance SLA, while storing the data for all customers in a set of Collections that are shared across all Virtual Instances. Different Virtual Instances could be used based on the number of customers and the compute size required for performance. In this example, the Virtual Instances could be configured as follows:

  • Standard: L Virtual Instance - 5,000 tenants with variable performance and infrequent usage.
  • Premium: XL Virtual Instance - 300 tenants with medium performance and medium usage.
  • Dedicated: 10 L Virtual Instances - one L Virtual Instance dedicated to each of 10 customers to provide high performance and support high usage.

As customers increase their usage over time, moving from standard to premium offerings, all you need to do is increase the Virtual Instance size or move them onto their own dedicated Virtual Instance. No data movement, schema changes, or data replication is required. This makes it easy to support multi-tenant applications where usage patterns change over time. You can also bill back large tenants for their usage.
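The tier layout above can be sketched as a simple routing table in application code. This is a minimal illustration, not a Rockset API: the tier names come from the example, while `Tenant`, `virtual_instance_for`, `upgrade`, and the Virtual Instance names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical routing table: the instance names are illustrative only.
SHARED_INSTANCES = {
    "standard": "vi-standard-l",  # one L Virtual Instance shared by standard tenants
    "premium": "vi-premium-xl",   # one XL Virtual Instance shared by premium tenants
}


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    tier: str  # "standard", "premium", or "dedicated"


def virtual_instance_for(tenant: Tenant) -> str:
    """Return the name of the Virtual Instance that should serve this tenant."""
    if tenant.tier == "dedicated":
        # Each dedicated tenant is served by its own Virtual Instance.
        return f"vi-dedicated-{tenant.tenant_id}"
    try:
        return SHARED_INSTANCES[tenant.tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tenant.tier!r}") from None


def upgrade(tenant: Tenant, new_tier: str) -> Tenant:
    """Change only the routing; the shared Collections are not touched."""
    return Tenant(tenant_id=tenant.tenant_id, tier=new_tier)
```

Because the data lives in shared Collections, an upgrade in this sketch changes only which Virtual Instance a tenant's queries are sent to.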

Designing Collections For Multi-Tenant Query Patterns

You can consolidate multiple tenants’ data within a collection and use a designated field (e.g. tenant_id) to identify which tenant owns each document and to restrict data access.

During the collection creation process, you can optionally set up a clustering scheme to optimize for multi-tenant applications. By clustering on tenant_id, documents with the same tenant_id are stored together in the collection. This can help improve query performance when processing data within a tenant.

Rockset’s schemaless data model is advantageous in multi-tenant design as it allows different data shapes for different tenants without any additional data modeling. For example, many SaaS applications allow their users to create their own custom fields. In Rockset, you can add a field to your dataset that appears only in the documents of a specific tenant.
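As an illustration of tenant-specific shapes, the sketch below models documents from one shared collection as plain Python dictionaries. The tenant names, field names, and the `fields_by_tenant` helper are all hypothetical; the point is only that each tenant's documents can carry their own custom fields alongside the common tenant_id.

```python
# Documents in one hypothetical shared collection. Every document carries
# tenant_id; the custom fields differ per tenant.
documents = [
    {"tenant_id": "acme", "order_id": 1, "amount": 40.0, "loyalty_tier": "gold"},
    {"tenant_id": "globex", "order_id": 2, "amount": 15.5, "warehouse_code": "W7"},
    {"tenant_id": "acme", "order_id": 3, "amount": 9.0},
]


def fields_by_tenant(docs):
    """Collect the set of field names that each tenant's documents use."""
    fields = {}
    for doc in docs:
        fields.setdefault(doc["tenant_id"], set()).update(doc.keys())
    return fields


shapes = fields_by_tenant(documents)
```

Here the hypothetical `loyalty_tier` field exists only for one tenant and `warehouse_code` only for another, with no schema change needed for either.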

To ensure that one tenant does not access another tenant’s data, you can use Rockset’s Query Lambdas with query parameters to safely specify the literal values in your SQL. We recommend creating a tenant_id parameter and giving it no default value, so that it must be supplied at runtime, for added security. You can then convert the parameterized query into a Query Lambda that can be executed from a dedicated REST endpoint.
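A minimal sketch of this pattern follows. It only builds the SQL text and the request body and does not call Rockset. The collection name `commons.orders`, the column names, and the helper `build_execute_request` are hypothetical, and the body shape (a `parameters` list of name, type, and value entries) is an assumption that should be checked against the Query Lambda API reference.

```python
import json

# SQL for a hypothetical Query Lambda. `commons.orders` and the columns are
# assumptions; `:tenant_id` is a named parameter supplied at execution time.
TENANT_ORDERS_SQL = """
SELECT order_id, amount
FROM commons.orders
WHERE tenant_id = :tenant_id
"""


def build_execute_request(tenant_id: str) -> dict:
    """Build the request body for running the Query Lambda for one tenant.

    tenant_id has no default, and empty values are rejected, so a request
    can never be built without an explicit tenant scope.
    """
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise ValueError("tenant_id is required")
    # Assumed body shape; verify against the Query Lambda API reference.
    return {
        "parameters": [
            {"name": "tenant_id", "type": "string", "value": tenant_id}
        ]
    }


body = json.dumps(build_execute_request("acme"))
```

In an application, the tenant_id passed here would typically come from the authenticated session on the server rather than from client input, so a tenant cannot request another tenant's scope.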

Designing Collections With Views For Cross-Tenant Query Patterns

You can also combine multi-tenant collections with Views to allow for cross-tenant queries. You can use views to share particular data and SQL snippets with different teams.

For example, a company may have different teams (marketing, support, finance, etc.) all building applications on Rockset. You can create views that are specific to a team using Custom Roles.

Views enable multiple applications to be built on the same dataset while maintaining data access security and compliance.