Using Virtual Instances
Streaming Ingest
Today, the main Virtual Instance that all organizations are first created with is configured as the Ingest Virtual Instance. The Ingest VI is used to process streaming ingest for all collections. You cannot configure this behavior otherwise at this time.
Dedicated Streaming Ingest
Dedicated Streaming Ingest is an optional feature for your Ingest VI that provisions dedicated tailer infrastructure and applies configurations to deliver the best performance for your latency-sensitive ingest operations. To learn more about this feature, see the Dedicated Streaming Ingest documentation.
Collection Mounts: Making Data Available for Queries
A collection mount represents a relationship between a Collection (storage) and a Virtual Instance (compute). Queries routed to a particular Virtual Instance can only access a given collection if that collection is mounted to the Virtual Instance. Collection mounts do not incur any additional storage costs, but they can incur additional CPU overhead on the Virtual Instance to which they are mounted.
Mounting and unmounting a collection should only take a few seconds, but sometimes it can take longer (up to one minute). You can manage collection mounts through the Virtual Instances Details page in the Rockset Console or directly through the Rockset API.
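The mount relationship described above can be pictured with a small toy model. This is only an illustrative sketch, not Rockset code: the `VirtualInstance` class, its methods, and the collection path `commons.orders` are all hypothetical names chosen for the example.

```python
class VirtualInstance:
    """Toy model of a Virtual Instance (compute) and its collection mounts."""

    def __init__(self, name):
        self.name = name
        self.mounted = set()  # collection paths currently mounted on this VI

    def mount(self, collection):
        self.mounted.add(collection)

    def unmount(self, collection):
        self.mounted.discard(collection)

    def can_query(self, collection):
        # A query routed to this VI can only access mounted collections.
        return collection in self.mounted


main_vi = VirtualInstance("main")
reporting_vi = VirtualInstance("reporting")

# Mounting adds a compute-side relationship; the stored data is not copied.
main_vi.mount("commons.orders")
reporting_vi.mount("commons.orders")
reporting_vi.unmount("commons.orders")

print(main_vi.can_query("commons.orders"))       # True
print(reporting_vi.can_query("commons.orders"))  # False
```

In this model, a query sent to `reporting_vi` after the unmount cannot see `commons.orders`, even though the collection still exists in storage and remains queryable through `main_vi`.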
Routing Queries to Virtual Instances
In the Console Query Editor, you can select the Virtual Instance you're sending queries to:
You can also specify the Virtual Instance for queries through the Rockset API.
You can find the virtualInstanceId you need on the Virtual Instance details page, with a copy button right next to it.
Copy this ID and use it with the query endpoint, as follows:
https://api.[region].rockset.com/v1/orgs/self/virtualinstances/[virtualInstanceId]/queries
Alternatively, when executing Query Lambdas, include the virtualInstanceId field in the request body.
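As a minimal sketch, the query endpoint URL shown above could be assembled and wrapped in a request like this. The URL pattern comes from this page; the request body shape (`{"sql": {"query": ...}}`), the `ApiKey` authorization header, and the placeholder values for region, ID, and key are assumptions for illustration, so check the Rockset API reference before relying on them. The request is built but not sent.

```python
import json
import urllib.request


def build_query_request(region, virtual_instance_id, sql, api_key):
    """Build (but do not send) a query request routed to a specific VI."""
    # URL pattern taken from the documentation above.
    url = (
        f"https://api.{region}.rockset.com/v1/orgs/self/"
        f"virtualinstances/{virtual_instance_id}/queries"
    )
    # Assumed body shape and auth header; verify against the API reference.
    body = json.dumps({"sql": {"query": sql}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute your own region, VI ID, and API key.
request = build_query_request("REGION", "VI_ID", "SELECT 1", "API_KEY")
print(request.get_method(), request.full_url)
```

To actually run the query, the request would be passed to `urllib.request.urlopen(request)` with real credentials.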
Default Query Execution
If no Virtual Instance is specified, a query will be routed to the Default Query Virtual Instance. By default, the main VI is configured to be the Default Query VI when an organization is created. To configure a separate Query VI to be the Default Query VI, see our Default Query Virtual Instance documentation.
Virtual Instance Configuration
Virtual Instances have additional configurations for advanced usage. The default values should be appropriate in most circumstances.
Configuring via Console vs. API
The following can be configured in the console upon Virtual Instance creation and updated at any time under Additional Configuration in the Virtual Instances tab of the Rockset Console.
They can also be configured through the Rockset API via the Create Virtual Instance endpoint and updated at any time using the Update Virtual Instance endpoint.
Suspending, Resuming & Auto-Suspend
Virtual Instances can be suspended and resumed. Note that suspending a VI will unmount all collections and collections will not be automatically remounted when the VI is resumed unless the Remount collections on resume configuration is enabled for the VI.
You can configure an Auto-Suspend policy for a Virtual Instance to automatically suspend the VI if it is not queried for a specified amount of time. When suspended, you will no longer be charged for running the VI. By default, Virtual Instances created via the console will auto-suspend after 1 hour.
Remount Collections on Resume
If the Remount collections on resume setting is enabled, suspending a Virtual Instance will not unmount all collections. Instead, the collections will be suspended and remounted when the Virtual Instance is resumed. By default, remount on resume is enabled when creating a VI in the console. This can be configured via the Rockset API by setting enable_remount_on_resume to true using the Create Virtual Instance endpoint.
You may want to disable this setting if you do not want to incur the CPU overhead of mounting collections that are unused.
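A request body for the Create Virtual Instance endpoint might be put together as sketched below. Only `enable_remount_on_resume` is named on this page; the `name` and `auto_suspend_seconds` fields and the helper function itself are assumptions for illustration, so confirm the exact field names in the API reference.

```python
import json


def build_create_vi_payload(name, remount_on_resume=True, auto_suspend_seconds=3600):
    """Assemble a hypothetical Create Virtual Instance request body."""
    return {
        "name": name,                                   # assumed field name
        "auto_suspend_seconds": auto_suspend_seconds,   # assumed field name; 3600 s = 1 hour
        "enable_remount_on_resume": remount_on_resume,  # field named in the text above
    }


# Disable remount on resume to avoid the CPU overhead of remounting unused collections.
payload = build_create_vi_payload("reporting", remount_on_resume=False)
print(json.dumps(payload, indent=2))
```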
Mount Types
Rockset supports two different mount types (Live and Static) that determine if the mounted collection will receive any updates from the underlying collection:
- Live mounted collections will stay up-to-date with the underlying collection in real time. They will incur a small CPU and memory overhead required to process streaming updates. Live mounts are the default and should be used in most situations.
  - The processing of streaming updates for live mounted collections is called tailing. The part of memory that handles this is called the tailing buffer on the Query Virtual Instance and the ingest buffer on the Ingest Virtual Instance. The tailing buffer replicates the streaming updates from the ingest buffer. As a consequence, mounting a collection with a high ingest load requires more memory.
  - If the tailing buffer reaches its limit (24% of the total memory), the Query Virtual Instance will stop tailing. To prevent this, we generally recommend that the memory of your Query Virtual Instance be no less than 25% of the memory of your Ingest Virtual Instance. This means that, for the same instance class, the Query Virtual Instance should be no more than 2 sizes smaller than the Ingest Virtual Instance. You can use the Metrics Endpoint to monitor tailing buffer usage and to detect when tailing has stopped.
  - To resume tailing, increase the size of your Query Virtual Instance, decrease your ingest workload, or unmount some collections. The VI will attempt to resume tailing upon a VI switch or once every 20 minutes. However, if there is still insufficient memory, the VI will stop tailing again.
- Static mounted collections will not receive updates from the underlying collection. They can be used for queries where you don't expect the collection data to change. Static mounts will not incur the same CPU or memory overhead, since they do not process any updates.
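The memory guideline for live mounts can be checked with some simple arithmetic. The sketch below uses the 24% and 25% figures from the list above; the memory sizes in the example are hypothetical, and the size-step calculation assumes that each step down in size halves the available memory, which is what the "25% is 2 sizes smaller" guidance implies.

```python
import math

TAILING_BUFFER_FRACTION = 0.24    # tailing buffer limit as a share of total memory
MIN_QUERY_TO_INGEST_RATIO = 0.25  # recommended minimum Query VI / Ingest VI memory ratio


def meets_memory_guideline(query_vi_gib, ingest_vi_gib):
    """Return True if the Query VI has at least 25% of the Ingest VI's memory."""
    return query_vi_gib >= MIN_QUERY_TO_INGEST_RATIO * ingest_vi_gib


def tailing_buffer_limit_gib(query_vi_gib):
    """Memory available to the tailing buffer before tailing stops."""
    return TAILING_BUFFER_FRACTION * query_vi_gib


def max_sizes_smaller(ratio=MIN_QUERY_TO_INGEST_RATIO):
    """Number of size steps below the Ingest VI, assuming each step halves memory."""
    return int(math.log2(1 / ratio))


# Hypothetical memory sizes, for illustration only.
print(meets_memory_guideline(16, 64))  # True: 16 GiB is exactly 25% of 64 GiB
print(meets_memory_guideline(8, 64))   # False: 8 GiB is only 12.5% of 64 GiB
print(max_sizes_smaller())             # 2
```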
Auto-Scaling (Beta)
Set a CPU-based vertical auto-scaling policy on your main VI to automatically adjust the Virtual Instance size according to current workload demands. Learn more about this feature in our Virtual Instance Auto-Scaling documentation.
Managing Virtual Instances
Monitoring
You can find metrics specific to each Virtual Instance in the Console. The main Metrics tab has a 'Virtual Instance' selector, and each Virtual Instance Details page will also include some metrics.
The Metrics Endpoint includes Virtual Instance as a label in all compute-related metrics by default, allowing you to monitor each Virtual Instance independently as needed.
Access Management
You can scope privileges to specific Virtual Instances to restrict access to them. The following privileges can be given:
- Create: allows creating new Virtual Instances
- Query: allows sending queries to a particular VI
- Update: allows upsizing and downsizing as well as mounting/unmounting collections for a particular VI
- Suspend/Resume: allows suspending and resuming a particular VI
- Delete: allows deletion
You can find the full Role-Based Access Control reference here.