Micro-Batching
Micro-Batching is currently in Beta. The documentation is subject to change.
Micro-Batching is a configuration that can be enabled for your Ingest Virtual Instance. When Micro-Batching is enabled, your Ingest Virtual Instance will automatically suspend when your ingest has “caught up” (i.e. when ingest latency is near zero) and resume on a specified interval. You may have an active Query Virtual Instance that allows you to serve a query workload on mounted collections, even while your Ingest Virtual Instance is suspended.
Enabling Micro-Batching allows you to improve the cost efficiency of Rockset by trading higher ingest latency for lower cost, because suspended Virtual Instances do not incur compute costs.
By enabling Micro-Batching, your Ingest Virtual Instance will cyclically:
- Suspend when the average document detection latency and average document processing latency across your VI over the last 5 minutes are below 60 seconds. This indicates that your Ingest VI has “caught up” and is processing documents that were recently inserted into your source.
  - Note: Document detection latency is the time it takes for a document inserted or updated in the source to be detected by Rockset. This is different from ingest latency, which is the time it takes for a document inserted or updated in the source to be queryable. Ingest latency includes detection latency plus processing latency. You can see your ingest latency on the Metrics page of the Rockset console, as well as the detection and processing latency per collection in the "Metrics" tab on the collection details page.
- Resume on an interval specified by you. For example, if you specify a resume interval of 30 minutes, your Ingest VI will resume 30 minutes after it was suspended. You may also resume your Ingest VI manually at any time.
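The suspend condition above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not Rockset's implementation: the sample format, the function name `has_caught_up`, and the reading that each average must be below 60 seconds on its own are all assumptions.

```python
from statistics import mean

SUSPEND_THRESHOLD_SECONDS = 60  # from the text: averages must be below 60 seconds
WINDOW_SECONDS = 5 * 60         # from the text: averaged over the last 5 minutes


def has_caught_up(samples, now):
    """Return True if the Ingest VI looks "caught up" at time `now` (in seconds).

    `samples` is a list of (timestamp_s, detection_latency_s, processing_latency_s)
    tuples. This shape is hypothetical and chosen only for the example.
    """
    # Keep only the samples that fall inside the trailing 5-minute window.
    recent = [s for s in samples if 0 <= now - s[0] <= WINDOW_SECONDS]
    if not recent:
        return False  # no data in the window: do not assume the VI has caught up
    avg_detection = mean(s[1] for s in recent)
    avg_processing = mean(s[2] for s in recent)
    # Assumption: both averages must individually be below the threshold.
    return (avg_detection < SUSPEND_THRESHOLD_SECONDS
            and avg_processing < SUSPEND_THRESHOLD_SECONDS)
```

For example, `has_caught_up([(700, 10, 20), (800, 5, 30)], now=900)` evaluates to `True`, because both averages over the window are well under 60 seconds.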
Micro-Batching Tip
We recommend enabling Micro-Batching for your Ingest Virtual Instance if:
- You would like to switch to a more cost-efficient Multi-VI architecture and you don’t mind having higher ingest latency for lower cost.
- Your ingest is periodic or sporadic, and you don’t need your Ingest VI to immediately pick up your ingest workload.
Using Micro-Batching
Setting a Resume Interval
Consider your maximum tolerable ingest latency, because the resume interval is correlated with this value. For example, if you select a resume interval of 60 minutes, your ingest latency should hover around 60 minutes while your Ingest VI is catching up on ingest. Resume intervals must be between 10 minutes and 2 hours.
Note that it may take some time for your Ingest VI to resume (i.e. RESUMING state). This time is not included in the resume interval.
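As a rough sketch of these rules, the Python below validates a resume interval and estimates when the Ingest VI would be active again. The function names and the `resuming_duration` estimate are hypothetical, and treating the 10-minute and 2-hour bounds as inclusive is an assumption.

```python
from datetime import datetime, timedelta

MIN_RESUME_INTERVAL = timedelta(minutes=10)  # lower bound from the text
MAX_RESUME_INTERVAL = timedelta(hours=2)     # upper bound from the text


def validate_resume_interval(interval):
    """Return `interval` unchanged if it is within the allowed range."""
    # Assumption: both bounds are inclusive.
    if not MIN_RESUME_INTERVAL <= interval <= MAX_RESUME_INTERVAL:
        raise ValueError(
            f"resume interval must be between 10 minutes and 2 hours, got {interval}"
        )
    return interval


def earliest_active_time(suspended_at, interval, resuming_duration):
    """Estimate when the VI is active again after a suspension.

    The resume starts `interval` after the suspension. The time spent in the
    RESUMING state is not part of the interval, so it is added on top.
    `resuming_duration` is a caller-supplied estimate, not a documented value.
    """
    return suspended_at + validate_resume_interval(interval) + resuming_duration
```

With a 30-minute interval, a suspension at 12:00, and an estimated 2 minutes in RESUMING, `earliest_active_time(datetime(2024, 1, 1, 12, 0), timedelta(minutes=30), timedelta(minutes=2))` gives 12:32.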
Handling Unexpectedly High or Increasing Ingest Latency
If your ingest latency is unexpectedly high (for example, you set a resume interval of 30 minutes but your ingest latency peaks at 1 hour), we may need to tune Micro-Batching to your ingest workload. Please reach out to Rockset Customer Support with details about your issue.
If your ingest latency is continuously increasing, it’s likely that your Ingest VI is unable to keep up with the rate of ingest. In other words, your document detection latency is low, but your document processing latency is high, and you may need to scale up to a larger VI size. To avoid needing to manually scale up your Ingest VI in these scenarios, we recommend setting an Auto-Scaling policy on your Ingest VI. Note that your Ingest VI will not suspend until the ingest backlog is cleared.
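To tell the two situations apart from your own metrics, a check along these lines may help. It is a minimal sketch: the function name, the input (one peak ingest latency per Micro-Batching cycle, in minutes), the three-cycle minimum, and the `tolerance` factor of 1.5 are all assumptions for illustration, not Rockset-defined thresholds.

```python
def diagnose_ingest_latency(peak_latencies_min, resume_interval_min, tolerance=1.5):
    """Classify per-cycle peak ingest latencies (minutes, oldest first).

    Returns "increasing" if every cycle peaked higher than the one before
    (the VI may be too small), "high" if the worst peak is well above the
    resume interval (may need tuning), and "ok" otherwise.
    """
    pairs = zip(peak_latencies_min, peak_latencies_min[1:])
    # Assumption: require at least three cycles before calling it a trend.
    if len(peak_latencies_min) >= 3 and all(later > earlier for earlier, later in pairs):
        return "increasing"
    # Assumption: "unexpectedly high" means more than `tolerance` x the interval.
    if peak_latencies_min and max(peak_latencies_min) > tolerance * resume_interval_min:
        return "high"
    return "ok"
```

For a 30-minute resume interval, peaks of `[30, 45, 70, 95]` are classified as `"increasing"`, while peaks of `[60, 58, 61]` are classified as `"high"`.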
Multi-VI Architecture
Using Micro-Batching requires you to build a multi-VI architecture in Rockset. We recommend having at least one Query Virtual Instance that has auto-suspend disabled when Micro-Batching is enabled for your Ingest Virtual Instance.
Limitations on Micro-Batching with Default Query Virtual Instances
You cannot enable Micro-Batching on an Ingest VI that is also configured to be the Default Query VI. Similarly, you cannot set a Virtual Instance with Micro-Batching enabled to be the Default Query VI.
Live Mounts
Live mounts can be created while your Ingest Virtual Instance is suspended. Both newly created and existing mounts will be queryable for data inserted up to the point of suspension (minus any ingest latency).
Miscellaneous
Virtual Instances must be resumed for a minimum of 10 minutes before they are re-suspended.
Restrictions
While your Ingest VI is suspended, you will be unable to:
- Create a collection
- Ingest from IIS queries (queries will run to completion)
- Send Write API requests
- Create snapshots
- Receive updates to _events, ingest logs, or query logs
Additionally:
- While your Ingest Virtual Instance is suspended, you cannot disable Micro-Batching. You can disable Micro-Batching after resuming your Ingest Virtual Instance.
- Your Ingest VI will not suspend until all collections and mounts are READY and any ongoing bulk ingests and IIS queries have run to completion.
- While your Ingest VI is suspended, you will not receive live ingest metrics in console or metrics endpoint. Ingest metrics recorded while the Ingest VI is suspended will be 0. However, metrics that were recorded while the VI was ACTIVE will still be available.
- Indexes that are building will not progress until your Ingest VI has resumed.
Examples of Micro-Batching Configurations
Acme Corp
- Ingest VI: SMALL
- Query VI(s): MEDIUM
- Ingest Rate: 1 MiB/s
- SMALL Peak Ingest Rate: 3 MiB/s
- Micro-Batching Resume Interval: 15 minutes
- Result: The ingest VI suspends for 15 minutes and resumes for about 5 minutes, processing a data backlog of 900 MiB at 3 MiB/s. This allows Acme Corp to save approximately 75% on ingest compute costs.
Stark Industries
- Ingest VI: LARGE
- Query VI(s): LARGE, LARGE
- Ingest Rate: 15 MiB/s
- LARGE Peak Ingest Rate: 18 MiB/s
- Micro-Batching Resume Interval: 2 hours
- Result: After a 2-hour suspension, there is a data backlog of about 108,000 MiB (roughly 105 GiB). Once resumed, the VI ingests at 18 MiB/s, and the backlog is processed over 100 minutes. Overall, Stark Industries saves about 42% on ingest compute costs.
You can find the peak ingest rates for each VI size here.
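The backlog and catch-up arithmetic in the examples above can be reproduced with a short Python sketch. The function name and the model are assumptions made for illustration. The model is deliberately simple: it ignores data that keeps arriving while the VI catches up, the time spent in the RESUMING state, and the 10-minute minimum resumed period, so its savings figure is best read as an optimistic upper bound.

```python
def estimate_cycle(ingest_rate_mib_s, peak_rate_mib_s, resume_interval_min):
    """Estimate one suspend/resume cycle under a simplified model.

    Returns (backlog_mib, catch_up_minutes, suspended_fraction), where
    suspended_fraction is the share of the cycle spent suspended and is
    used here as a rough proxy for ingest compute savings.
    """
    suspended_s = resume_interval_min * 60
    # Data that accumulates in the source while the VI is suspended.
    backlog_mib = ingest_rate_mib_s * suspended_s
    # Time to drain that backlog at the VI's peak ingest rate.
    catch_up_s = backlog_mib / peak_rate_mib_s
    suspended_fraction = suspended_s / (suspended_s + catch_up_s)
    return backlog_mib, catch_up_s / 60, suspended_fraction


# Acme Corp: 1 MiB/s ingest, SMALL peak rate of 3 MiB/s, 15-minute interval.
print(estimate_cycle(1, 3, 15))  # (900, 5.0, 0.75)
```

This matches the Acme Corp figures (900 MiB backlog, about 5 minutes of catch-up, roughly 75% savings). For the Stark Industries inputs it gives the same 108,000 MiB backlog and 100-minute catch-up time as the example.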