added

New Metrics in Virtual Instance page

We added new metrics and functionality to the Virtual Instance page in the console. You can now see a breakdown of which system processes use memory for a Virtual Instance, along with a new chart that shows the Cache Hit Rate. There is also a new time picker and a crosshair that stays in sync across all charts on the same page, showing the metric data for the point in time being hovered over.

added

Support for adding and removing sources in the console

You can now add sources to and remove sources from a collection in the console. Learn more about adding and removing sources here.

added

Support for using PARTITION BY with INSERT INTO s3

You can now split the results of an INSERT INTO s3 query so that records with the same field value are emitted together in files prefixed with that value. Learn more here.
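To illustrate the grouping behavior described above, here is a minimal Python sketch. It is not Rockset code and does not reproduce the actual file naming or SQL syntax; the function `partition_records` and the use of the stringified field value as the prefix are assumptions made for illustration only.

```python
from collections import defaultdict


def partition_records(records, field):
    """Group records by the value of `field`.

    Hypothetical helper: each key stands in for the file prefix that
    records sharing that field value would be written under.
    """
    groups = defaultdict(list)
    for record in records:
        # Assumption: the prefix is simply the stringified field value.
        prefix = str(record.get(field))
        groups[prefix].append(record)
    return dict(groups)


rows = [
    {"country": "US", "total": 10},
    {"country": "DE", "total": 7},
    {"country": "US", "total": 3},
]
partitions = partition_records(rows, "country")
for prefix, members in sorted(partitions.items()):
    print(prefix, len(members))
```

Both "US" rows end up under the same key, mirroring how records with the same field value are emitted together.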

added

INSERT INTO s3 support for Parquet

You can now export query results from Rockset and write them directly to Amazon S3 in the Parquet data format. Learn more here.

added

EDIT_DISTANCE function

EDIT_DISTANCE calculates the edit distance between two strings, s1 and s2, according to the specified algorithm. This new function replaces UDFs, offering up to a 10x speed-up on fuzzy search queries. Learn more here.
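For readers unfamiliar with edit distance, the following Python sketch computes the Levenshtein distance, one common edit distance algorithm. Whether Levenshtein is among the algorithms EDIT_DISTANCE accepts is an assumption here, and this is not how Rockset implements the function; it only shows what the metric measures.

```python
def levenshtein(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn s1 into s2."""
    # previous[j] holds the distance between the empty prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # distance between s1[:i] and the empty string
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

A fuzzy search would typically keep only candidates whose distance to the search term falls below a small threshold.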

added

Support for ingesting Parquet files with ZSTD compression

Rockset now supports Parquet files using ZSTD compression as an ingestion format, making it easier to ingest compressed data for search and analytics. Learn more here.

improved

Vectorized operations on int128 and uint256 integers

We vectorized operations on wide integer types to more efficiently process numeric and decimal values that need to maintain precision. We recorded a 10x improvement in performance with these optimizations.

added

ARRAY_CONTAINS_PREFIX function

ARRAY_CONTAINS_PREFIX(string_array, prefix) returns true if any element of string_array starts with the given prefix. This new function improves the ergonomics of prefix search and results in better performance on nested arrays. Learn more here.
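The semantics stated above can be sketched in a few lines of Python. This is an illustrative equivalent, not the Rockset implementation; how the real function treats null or non-string elements is not covered here.

```python
def array_contains_prefix(string_array, prefix):
    """Return True if any element of string_array starts with prefix."""
    return any(element.startswith(prefix) for element in string_array)


tags = ["analytics", "search", "streaming"]
print(array_contains_prefix(tags, "sea"))
print(array_contains_prefix(tags, "batch"))
```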

improved

Vectorized MIN and MAX for timestamps and dates

We vectorized MIN and MAX operations for timestamps and dates to improve the performance of time series analytics. Internal benchmarks show an 82% performance improvement with vectorization.

improved

SQL join optimizations

We decreased the number of memory allocations across sequential hash computation operations, improving the efficiency of SQL joins.