The Rockset platform is built to take full advantage of cloud infrastructure services and automated self-healing with Kubernetes. This document describes the options Rockset customers can choose from to balance high availability against cost and meet the service levels of their business-critical applications.
The on-demand nature of cloud infrastructure makes it practical to design software in which single or multiple failovers of hardware, software, and data center or region components are handled with little or no disruption or downtime. The Rockset platform takes full advantage of data redundancy and replication within and across regions.
## Data Tiers
Rockset has multiple options for ingesting data in real time. Ingestion involves storing the original data as well as the set of indexes required to deliver low-latency queries on that data. Data is stored in two places:
**Object Storage:** Data is stored in object storage for the highest data availability and durability. This layer is an encrypted repository of all documents and indexes in a customer’s collections, and it is automatically accessible across all availability zones in that region. Rockset uses Amazon S3 as the cloud object storage for all AWS deployments, and S3 is designed for 99.999999999% (eleven 9s) durability.
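To give a sense of scale for the eleven 9s figure, the following back-of-the-envelope sketch converts a durability level into an expected number of lost objects per year. It assumes the figure is read as a per-object annual durability, and the object count is an illustrative assumption rather than a Rockset number.

```python
from decimal import Decimal


def expected_annual_losses(num_objects: int, durability_nines: int) -> Decimal:
    """Expected objects lost per year for a durability given as a count of 9s."""
    # Eleven 9s of durability means an annual loss probability of 10^-11 per object.
    annual_loss_probability = Decimal(10) ** -durability_nines
    return Decimal(num_objects) * annual_loss_probability


# Illustrative assumption: 10 million stored objects.
losses_per_year = expected_annual_losses(10_000_000, 11)
years_per_lost_object = 1 / losses_per_year
```

Under these assumptions, the expected loss works out to a single object roughly once every 10,000 years.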
**Hot Storage:** This tier compensates for the slow read performance of S3 by caching a copy of the data files on NVMe SSDs. If a single node in this tier fails, a new node can fetch a copy of the missing data from object storage. The Hot Storage tier is local to a geographic region.
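The relationship between the two tiers resembles a read-through cache in front of a durable store. The sketch below is a simplified illustration of that pattern only, not Rockset's implementation; `ObjectStore` and `HotStorageNode` are hypothetical names, and an in-memory dictionary stands in for both S3 and the SSD cache.

```python
class ObjectStore:
    """Stand-in for durable object storage (the source of truth)."""

    def __init__(self):
        self._blobs = {}
        self.reads = 0  # counts slow reads, for illustration

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        self.reads += 1
        return self._blobs[key]


class HotStorageNode:
    """Stand-in for a hot storage node that caches files locally."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def read(self, key):
        # Serve from the local cache, fetching from object storage on a miss.
        if key not in self._cache:
            self._cache[key] = self._store.get(key)
        return self._cache[key]


store = ObjectStore()
store.put("file-1", b"index data")

node = HotStorageNode(store)
node.read("file-1")  # miss: fetched from object storage
node.read("file-1")  # hit: served from the local cache
reads_before_failure = store.reads

# Simulate a node failure: the replacement starts with an empty cache
# and refills it from object storage, so no data is lost.
replacement = HotStorageNode(store)
recovered = replacement.read("file-1")
```

Because object storage remains the source of truth, losing a cache node costs only the time needed to refetch the data.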
In addition to customer data, Rockset also stores metadata for every organization, such as collection configuration, integration configuration, workspaces, views, query lambdas, users, API keys, and roles. The metadata store is designed to survive the loss of a single region or single availability zone. This metadata is replicated across three different geographical regions and continuously backed up. Any single failure in the metadata service, including an entire region failure, will incur no downtime, interruptions, or data loss.
## Compute Tier
Queries in Rockset are served by a virtual instance (VI), which has associated compute and memory. Each VI caches a working set of the data recently accessed by queries that have run in that VI, and it pulls that data from the hot storage tier.
Every VI in Rockset can recover from one or multiple hardware failures by being assigned a new pod and quickly retrieving data from the hot storage tier. These kinds of failures have little or no impact on requests to the Rockset service.
Compute for ingesting data into Rockset uses the same mechanism to detect any single pod failure and to assign replacement pods after transient hardware or software failures. Similarly, such failures have no impact on the data ingested or on the time it takes for the data to become available for queries.
For performance reasons, all resources dedicated to a single VI run in a single availability zone. The following section discusses design patterns for applications with higher availability needs.
## Achieving Higher Levels of Availability
With the basic deployment options, Rockset tolerates individual server failures with little or no disruption in service. If you need higher availability in the case of an availability zone or region failure, Rockset provides the following deployment options:
**Standard VI:** Represents the offering outlined above. It survives the loss of one or more nodes but has a longer mean time to recovery (MTTR) if the entire availability zone hosting the VI fails. Losing a different availability zone does not impact the VI.
**Hot-Hot:** This deployment option represents a fully redundant system, including redundant ingestion. The same dataset is ingested into Rockset in multiple regions, each with its own Hot Storage tier. It provides the highest availability, with an MTTR of a few seconds.
**Hot-Cold [Coming Soon]:** This deployment option leverages features in Amazon S3 to replicate the object storage tier to a second region. If the original region fails, the hot storage layer and the VIs can be restarted in the second region. Depending on data size, this type of recovery can take a few hours.
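To compare how recovery time affects availability across these options, the sketch below estimates the fraction of a year a service stays up for a given MTTR. The inputs are illustrative assumptions only (one outage per year, a 5-second recovery for Hot-Hot, and a 3-hour recovery for Hot-Cold); they are not measured or guaranteed Rockset figures.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def availability(mttr_seconds: float, outages_per_year: float = 1.0) -> float:
    """Fraction of the year the service is up, given the recovery time per outage."""
    downtime_seconds = mttr_seconds * outages_per_year
    return 1.0 - downtime_seconds / SECONDS_PER_YEAR


# Assumed values for illustration only.
hot_hot = availability(mttr_seconds=5)
hot_cold = availability(mttr_seconds=3 * 60 * 60)
```

With these assumed inputs, a three-hour recovery costs about 0.03% of the year, while a five-second recovery is negligible by comparison.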
**Note:** In all multi-region cases, the customer is responsible for detecting the failure and redirecting traffic on their side. This may not be an issue if applications are cloud-based and running in the same regions.
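One way an application might detect a failure and redirect traffic is to try an ordered list of regional endpoints and fall back to the next one on a connection error. The following is a minimal sketch under that assumption. The endpoint URLs, `query_with_failover`, and the `run_query` callable are hypothetical placeholders and not part of any Rockset client; a stub stands in for the real request so the example runs on its own.

```python
from typing import Callable, Sequence


class AllRegionsFailed(Exception):
    """Raised when no configured region could serve the request."""


def query_with_failover(
    endpoints: Sequence[str],
    run_query: Callable[[str], dict],
) -> dict:
    """Try each regional endpoint in order and return the first success."""
    errors = []
    for endpoint in endpoints:
        try:
            return run_query(endpoint)
        except (ConnectionError, TimeoutError) as exc:
            # Treat connection problems as a regional failure and move on.
            errors.append(f"{endpoint}: {exc}")
    raise AllRegionsFailed("; ".join(errors))


# Hypothetical endpoints, listed in order of preference.
ENDPOINTS = ["https://primary.example.com", "https://secondary.example.com"]


def fake_run_query(endpoint: str) -> dict:
    """Stub that simulates an outage in the primary region."""
    if "primary" in endpoint:
        raise ConnectionError("region unreachable")
    return {"served_by": endpoint, "rows": []}


result = query_with_failover(ENDPOINTS, fake_run_query)
```

In practice, the timeout used to declare a region unavailable bounds how quickly traffic moves over, so it is worth tuning against the recovery time the application can tolerate.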