## Metrics Endpoint

Beyond the Console Metrics page, additional metrics are accessible through the metrics endpoint in [Prometheus/OpenMetrics format](🔗). This format is compatible with monitoring and alerting tools such as Prometheus, Datadog, and AWS CloudWatch (among many others).

You can enable the metrics endpoint for your [<<glossary:Virtual Instance>>](🔗) from the [Metrics tab in the Rockset Console](🔗).

You can read more about the three metric types currently used here:

  • [Counter](🔗)

  • [Gauge](🔗)

  • [Histogram](🔗)

Some metric types (e.g. Histogram) are represented through a set of sub-items.

For example, the `rockset_query_latency_seconds` metric (a Histogram) is represented by several `rockset_query_latency_seconds_bucket` records along with a `rockset_query_latency_seconds_sum` record.

Most monitoring clients will handle these complex types automatically on your behalf.
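To make the sub-item layout concrete, here is a minimal sketch that parses a hand-written sample in Prometheus text format and groups the records by sub-item name. The bucket boundaries (`le` values) and counts are invented for illustration, and `parse_samples` is a hypothetical helper. Output from your own endpoint will differ, and a real monitoring client would normally do this parsing for you.

```python
import re
from collections import defaultdict

# Hypothetical sample in Prometheus text exposition format.
# Bucket boundaries and values are invented for illustration only.
SAMPLE = """\
# TYPE rockset_query_latency_seconds histogram
rockset_query_latency_seconds_bucket{le="0.1"} 12
rockset_query_latency_seconds_bucket{le="1"} 30
rockset_query_latency_seconds_bucket{le="+Inf"} 34
rockset_query_latency_seconds_sum 21.5
"""

# metric_name, optional {labels}, whitespace, value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Group sample lines by metric name, skipping comments and blank lines."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples[match.group("name")].append((labels, float(match.group("value"))))
    return dict(samples)


parsed = parse_samples(SAMPLE)
buckets = parsed["rockset_query_latency_seconds_bucket"]  # one entry per `le` boundary
total = parsed["rockset_query_latency_seconds_sum"][0][1]  # sum of all observed latencies
```

Each bucket record carries an `le` ("less than or equal") label, and its value is the cumulative number of observations at or below that boundary.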

The following metrics are provided and updated at one-minute intervals:

### Virtual Instance Metrics

| Metric | Type | Description |
| --- | --- | --- |
| `rockset_leaf_cpu_utilization_percentage` | Gauge | Average CPU utilization across the leaves in a Virtual Instance. Leaf nodes store and ingest data. Leaf CPU utilization reflects both data ingestion and query processing. |
| `rockset_leaf_memory_utilization_percentage` | Gauge | Average memory utilization across the leaves in a Virtual Instance. Leaf nodes store and ingest data. Leaf memory utilization reflects both data ingestion and query processing. |
| `rockset_agg_cpu_utilization_percentage` | Gauge | Average aggregator CPU utilization. Aggregator nodes aggregate data during query execution. |
| `rockset_agg_memory_utilization_percentage` | Gauge | Average aggregator memory utilization. Aggregator nodes aggregate data during query execution. |

Virtual Instance metrics are useful for monitoring compute usage and alerting when your VI is near the limits of its performance. Query performance and ingest latency may both degrade as these metrics near 100%.

### Collection Metrics

| Metric | Type | Description |
| --- | --- | --- |
| `rockset_collections` | Gauge | Number of collections. |
| `rockset_collection_size_bytes` | Gauge | Collection size in bytes. Note that this size reflects the current [storage size](🔗) and will decrease as documents are deleted or expire under the specified retention duration. |
| `rockset_collection_documents` | Gauge | Number of documents currently in each collection. |
| `rockset_collection_total_ingest_bytes` | Counter | Number of bytes ingested over the history of each collection. Note that this count only ever increases and is therefore well suited for `increase` and `rate` functions to compute ingest over time. |
| `rockset_collection_parse_errors` | Counter | Number of parse errors for each collection. |
| `rockset_collection_data_discovery_latency` | Histogram | The duration (in seconds) from when new or updated data appears in a data source until Rockset first detects it. Elevated values for this metric often reflect configuration issues in the underlying data source (e.g. an inadequate number of RCUs provisioned for DynamoDB sources). |
| `rockset_collection_data_process_latency` | Histogram | The duration (in seconds) from when new or updated data is first detected by Rockset until the data is fully processed and queryable. Elevated values for this metric can be alleviated by allocating additional compute to your Virtual Instance. |
| `rockset_data_discovery_latency` | Histogram | Data discovery latency across all collections. Unlike the collection-specific metric, this metric continues to include data from deleted collections. |
| `rockset_data_process_latency` | Histogram | Data process latency across all collections. Unlike the collection-specific metric, this metric continues to include data from deleted collections. |
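Because `rockset_collection_total_ingest_bytes` is a counter, ingest over a time window can be derived from successive scrapes. The sketch below mirrors the idea behind the `increase` and `rate` functions in plain Python. The function names and sample values are hypothetical, and the counter-reset handling is a simplification of what Prometheus itself does (it does not extrapolate to window edges, for instance).

```python
def counter_increase(samples):
    """Total increase across (timestamp_seconds, value) samples in time order.

    A drop in value is treated as a counter reset (the counter restarted
    from zero), so the post-reset value is counted as new growth.
    """
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += curr - prev if curr >= prev else curr
    return increase


def counter_rate(samples):
    """Average per-second rate over the window covered by the samples."""
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("need at least two samples with increasing timestamps")
    return counter_increase(samples) / elapsed


# Hypothetical scrapes taken one minute apart: (unix_seconds, bytes_ingested).
scrapes = [(0, 1_000_000), (60, 1_600_000), (120, 2_200_000)]
ingested = counter_increase(scrapes)      # bytes ingested during the window
bytes_per_second = counter_rate(scrapes)  # average ingest rate over the window
```

In practice you would usually let your monitoring tool compute this. The sketch only shows why a value that never decreases is convenient for measuring change over time.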

### Query Metrics

| Metric | Type | Description |
| --- | --- | --- |
| `rockset_queries` | Counter | Number of queries. |
| `rockset_query_latency_seconds` | Histogram | Query latency, including admission control duration. Note that this metric is exposed as a histogram, so you can compute any percentile (PXX) you like, with an accuracy of roughly ±15% in almost all cases. |
| `rockset_query_admission_latency_seconds` | Histogram | Admission control queue duration per query, if admission control is enabled for your account. |
| `rockset_query_queue_size` | Gauge | Number of queries currently queued (throttled by admission control). |
| `rockset_query_errors` | Counter | Number of query execution errors, labeled by HTTP error code (e.g. `404`, `500`). |
| `rockset_query_lambda_queries` | Counter | Number of queries by Query Lambda. Note that the `tag` label is tracked if and only if the execution is [specified by tag](🔗). |
| `rockset_query_lambda_latency_seconds` | Histogram | Query latency by Query Lambda. Note that the `tag` label is tracked if and only if the execution is [specified by tag](🔗). |
| `rockset_query_lambda_admission_latency_seconds` | Histogram | Query admission latency by Query Lambda. Note that the `tag` label is tracked if and only if the execution is [specified by tag](🔗). |
| `rockset_query_lambda_errors` | Counter | Number of query execution errors by Query Lambda. Note that the `tag` label is tracked if and only if the execution is [specified by tag](🔗). |
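To show how a percentile can be derived from a histogram such as `rockset_query_latency_seconds`, here is a minimal sketch that interpolates linearly within cumulative buckets, similar in spirit to the Prometheus `histogram_quantile` function. The function name `estimate_quantile` and the bucket boundaries and counts are assumptions made for illustration, not the boundaries Rockset actually uses.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound and ending with the +Inf bucket. Observations are assumed to
    be spread evenly within each bucket (linear interpolation).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Cannot interpolate into +Inf; fall back to the last finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative buckets: (upper bound in seconds, queries at or below it).
latency_buckets = [(0.1, 12), (1.0, 30), (math.inf, 34)]
p50 = estimate_quantile(0.50, latency_buckets)
p95 = estimate_quantile(0.95, latency_buckets)
```

Because the estimate can only be as precise as the bucket widths allow, percentiles computed this way are approximations rather than exact values.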

### Reference Configurations & Templates

You can find reference configurations and templates for Prometheus, Datadog, Grafana and Alertmanager [here](🔗).

Below is an example of a Prometheus `scrape_configs`: