Using Virtual Instances

Streaming Ingest

Currently, the main Virtual Instance that every organization is created with is configured as the Ingest Virtual Instance (Ingest VI). The Ingest VI processes streaming ingest for all collections. This behavior cannot be changed at this time.

Dedicated Streaming Ingest

Dedicated Streaming Ingest is an optional feature for your Ingest VI that provisions dedicated tailer infrastructure and applies configurations to deliver the best performance for your latency-sensitive ingest operations. To learn more, see the Dedicated Streaming Ingest documentation.

Collection Mounts: Making Data Available for Queries

A collection mount represents a relationship between a Collection (storage) and a Virtual Instance (compute). Queries routed to a particular Virtual Instance can only access a given collection if that collection is mounted to the Virtual Instance. Collection mounts do not incur any additional storage costs, but they can incur additional CPU overhead on the Virtual Instance to which they are mounted.

Mounting or unmounting a collection usually takes only a few seconds, but it can occasionally take up to one minute. You can manage collection mounts through the Virtual Instance Details page in the Rockset Console or directly through the Rockset API.

Routing Queries to Virtual Instances

In the Console Query Editor, you can select the Virtual Instance you're sending queries to.

You can also specify the Virtual Instance for queries through the Rockset API.

You can find the virtualInstanceId on the Virtual Instance Details page, with a copy button right next to it.

Copy this ID and use it with the query endpoint, as follows:

https://api.[region].rockset.com/v1/orgs/self/virtualinstances/[virtualInstanceId]/queries

Alternatively, when executing Query Lambdas, include the virtualInstanceId field in the request body.
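As a rough illustration of the endpoint above, the following Python sketch builds (but does not send) a query request for a specific Virtual Instance. The helper names are hypothetical, and the `ApiKey` authorization header and the `{"sql": {"query": ...}}` body shape are assumptions that should be checked against the Rockset API reference.

```python
import json
import urllib.request


def vi_query_url(region: str, virtual_instance_id: str) -> str:
    """Fill in the [region] and [virtualInstanceId] placeholders of the query endpoint."""
    return (
        f"https://api.{region}.rockset.com"
        f"/v1/orgs/self/virtualinstances/{virtual_instance_id}/queries"
    )


def build_vi_query_request(
    region: str, virtual_instance_id: str, sql: str, api_key: str
) -> urllib.request.Request:
    """Build an unsent POST request for the query endpoint.

    The body shape and the Authorization header format are assumptions.
    """
    body = json.dumps({"sql": {"query": sql}}).encode("utf-8")
    return urllib.request.Request(
        vi_query_url(region, virtual_instance_id),
        data=body,
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the query. The region and ID values are whatever applies to your organization.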

🚧

Default Query Execution

If no Virtual Instance is specified, a query will be routed to the Default Query Virtual Instance.

By default, the main VI is configured as the Default Query VI when an organization is created. To configure a separate Query VI to be the Default Query VI, see our Default Query Virtual Instance documentation.

Virtual Instance Configuration

Virtual Instances have additional configurations for advanced usage. The default values should be appropriate in most circumstances.

💡

Configuring via Console vs. API

The following settings can be configured in the Console when creating a Virtual Instance and updated at any time under Additional Configuration in the Virtual Instances tab of the Rockset Console.

They can also be configured through the Rockset API via the Create Virtual Instance endpoint and updated at any time using the Update Virtual Instance endpoint.

Suspending, Resuming & Auto-Suspend

Virtual Instances can be suspended and resumed. Note that suspending a VI unmounts all of its collections. The collections are not automatically remounted when the VI is resumed unless the Remount collections on resume configuration is enabled for the VI.

You can configure an Auto-Suspend policy for a Virtual Instance to automatically suspend the VI if it is not queried for a specified amount of time. While a VI is suspended, you are not charged for running it. By default, Virtual Instances created via the console will auto-suspend after 1 hour.

Remount Collections on Resume

If the Remount collections on resume setting is enabled, suspending a Virtual Instance will not unmount all collections. Instead, the collections will be suspended and remounted when the Virtual Instance is resumed. By default, remount on resume is enabled when creating a VI in the console. This can be configured via the Rockset API by setting enable_remount_on_resume to true using the Create Virtual Instance endpoint.

You may want to disable this setting if you do not want to incur the CPU overhead of mounting collections that are unused.
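A minimal sketch of a request body that sets this flag, for use with the Create Virtual Instance endpoint. Only `enable_remount_on_resume` comes from the text above. The `name` field and the helper function are hypothetical placeholders, and a real request will likely need other fields described in the API reference.

```python
import json


def create_vi_body(name: str, remount_on_resume: bool = True) -> str:
    """Serialize an illustrative Create Virtual Instance request body."""
    payload = {
        "name": name,  # hypothetical field, shown for illustration only
        "enable_remount_on_resume": remount_on_resume,  # field named in the text
    }
    return json.dumps(payload, indent=2)
```

Calling `create_vi_body("reporting", remount_on_resume=False)` produces a body that leaves the setting disabled, which matches the case where unused collections should not be remounted.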

Mount Types

Rockset supports two different mount types (Live and Static) that determine if the mounted collection will receive any updates from the underlying collection:

  • Live mounted collections will stay up-to-date with the underlying collection in real time. They will incur a small CPU and memory overhead required to process streaming updates. Live mounts are the default and should be used in most situations.
    • The processing of streaming updates for live mounted collections is called tailing. The part of memory that handles it is called the tailing buffer on the Query Virtual Instance and the ingest buffer on the Ingest Virtual Instance. The tailing buffer replicates the streaming updates from the ingest buffer. As a consequence, mounting a collection with a high ingest load requires more memory.
    • If the tailing buffer reaches its limit (24% of the total memory), the Query Virtual Instance will stop tailing. To prevent this, we generally recommend that the memory of your Query Virtual Instance be no less than 25% of the memory of your Ingest Virtual Instance. This means that for the same instance class, the Query Virtual Instance should be no more than 2 sizes smaller than the Ingest Virtual Instance. You can use the Metrics Endpoint to monitor tailing buffer usage and to detect when tailing has stopped.
      • To resume tailing, increase the size of your Query Virtual Instance, decrease your ingest workload, or unmount some collections. The VI will attempt to resume tailing upon a VI switch or once every 20 minutes. However, if there is still insufficient memory, the VI will stop tailing again.
  • Static mounted collections will not receive updates from the underlying collection. They can be used for queries where you don't expect the collection data to change. Static mounts will not incur the same CPU or memory overhead, since they do not process any updates.
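The sizing recommendation for live mounts can be checked with a little arithmetic. The sketch below assumes that each size step within an instance class doubles memory, which is what makes a 25% ratio correspond to 2 sizes. The function names and memory figures are illustrative, not part of any Rockset API.

```python
import math

# Recommended minimum ratio of Query VI memory to Ingest VI memory (from the text).
MIN_QUERY_TO_INGEST_MEMORY_RATIO = 0.25


def min_query_vi_memory(ingest_memory_gb: float) -> float:
    """Smallest recommended Query VI memory for a given Ingest VI memory."""
    return ingest_memory_gb * MIN_QUERY_TO_INGEST_MEMORY_RATIO


def max_sizes_smaller(ratio: float = MIN_QUERY_TO_INGEST_MEMORY_RATIO,
                      growth_per_size: float = 2.0) -> int:
    """How many size steps below the Ingest VI still satisfy the ratio.

    Assumes each size step multiplies memory by growth_per_size.
    """
    steps = math.log(1.0 / ratio) / math.log(growth_per_size)
    return math.floor(steps + 1e-9)  # small epsilon guards against float rounding


def meets_recommendation(query_memory_gb: float, ingest_memory_gb: float) -> bool:
    """True if the Query VI has at least 25% of the Ingest VI's memory."""
    return query_memory_gb >= min_query_vi_memory(ingest_memory_gb)
```

Under the doubling assumption, a Query VI two sizes below the Ingest VI has exactly one quarter of its memory and just meets the recommendation, while one three sizes below has one eighth and does not.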

Auto-Scaling (Beta)

Set a CPU-based vertical auto-scaling policy on your main VI to automatically adjust the Virtual Instance size according to current workload demands. Learn more about this feature in our Virtual Instance Auto-Scaling documentation.

Managing Virtual Instances

Monitoring

You can find metrics specific to each Virtual Instance in the Console. The main Metrics tab has a 'Virtual Instance' selector, and each Virtual Instance Details page will also include some metrics.

By default, the Metrics Endpoint includes Virtual Instance as a label in all compute-related metrics, allowing you to monitor each Virtual Instance independently as needed.
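To illustrate per-VI monitoring, the sketch below filters metric samples by a Virtual Instance label. It assumes the Metrics Endpoint returns Prometheus-style text lines of the form `name{label="value"} number`. The label name `virtual_instance_id` and the metric names in the usage example are hypothetical, so check the actual endpoint output for the real ones.

```python
import re

# Matches lines like: metric_name{label="value",other="x"} 12.5
_SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def samples_for_vi(metrics_text: str, vi_id: str,
                   label: str = "virtual_instance_id"):
    """Return (metric_name, value) pairs whose VI label equals vi_id.

    The default label name is an assumption, not a documented one.
    """
    selected = []
    for raw_line in metrics_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:  # skip samples without labels or malformed lines
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels")))
        if labels.get(label) == vi_id:
            selected.append((match.group("name"), float(match.group("value"))))
    return selected
```

For example, given a text containing `cpu_util{virtual_instance_id="vi-a"} 0.5` and `cpu_util{virtual_instance_id="vi-b"} 0.9`, calling `samples_for_vi(text, "vi-a")` keeps only the first sample.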

Access Management

You can scope privileges to specific Virtual Instances to restrict access to them. The following privileges can be given:

  • Create: allows creating new Virtual Instances
  • Query: allows sending queries to a particular VI
  • Update: allows upsizing and downsizing as well as mounting/unmounting collections for a particular VI
  • Suspend/Resume: allows suspending and resuming a particular VI
  • Delete: allows deleting a particular VI

For details, see the full Role-Based Access Control reference.