Get started with Rockset and go from ingesting data to executing a query using a REST API. Using the Rockset Console, you will learn to:
It’s free to get started with Rockset, and you can always build for free with under 2 GiB of data.
Sign up for a free 14-day Rockset trial at https://console.rockset.com/create. Part of the signup process is verifying your email address. This is a security measure required for all users who do not authenticate through GitHub or Google.
The verification process will take you to the Rockset console, where you will create an organization. Select your instance type and tell us about your data source and what you plan to build so we can customize your onboarding experience.
A collection in Rockset is the conceptual equivalent of a table in the relational world. Generally, you want a 1:1 mapping from your source tables/topics to Rockset collections.
We will create two collections from public datasets hosted on AWS S3. One dataset is a sample of movies with information including their genre, popularity and revenue. The other dataset is a sample of movie ratings by user. Both datasets are publicly available.
Click Collections in the left-hand navigation pane and find the Create a New Collection button. Select sample datasets as the data source.
Give your collection a name and an optional description. We’ll use the collection names movies and movie_ratings for the remainder of the document. Select the dataset movies, and a source preview will automatically generate so you can explore the semi-structured JSON data in tabular form.
You can apply transformations on incoming data, such as masking sensitive fields, and select a retention policy to automatically drop documents after a period of time.
Click to create the collection. You should now see a new collection in the Created state. It can take up to a minute for the collection to become Ready, at which point you’ll be able to explore the data and run queries against it.
Repeat these steps to create a second sample dataset collection that brings the movie_ratings data into Rockset.
Once the collections are Ready, you will enter the collection details view. In the collection details view, you can see the number of documents in the collection and the time of the last update. As new records come in, the count of documents will automatically update and you can run queries against the latest records. You can see all of the available fields, the schema that has been inferred from the collection data, as well as some additional information about the occurrences and distribution of your data. You can also inspect some sample documents in your collection.
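To make the idea of an inferred schema with field occurrences more concrete, here is a minimal sketch in Python. It is not how Rockset computes its schema; it only illustrates the kind of summary the collection details view shows. The sample documents and the summarize_fields helper are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_fields(documents):
    """Summarize which fields appear in a list of JSON-like dicts.

    Returns {field: {"occurrences": fraction of documents containing the
    field, "types": {type name: count}}}.
    """
    total = len(documents)
    if total == 0:
        return {}
    type_counts = defaultdict(Counter)
    for doc in documents:
        for field, value in doc.items():
            type_counts[field][type(value).__name__] += 1
    return {
        field: {
            "occurrences": sum(counts.values()) / total,
            "types": dict(counts),
        }
        for field, counts in type_counts.items()
    }


# Hypothetical documents, not taken from the real sample dataset.
sample_docs = [
    {"id": 1, "title": "Movie A", "popularity": 8.5},
    {"id": 2, "title": "Movie B"},
    {"id": "3", "title": "Movie C", "popularity": 4},
]

summary = summarize_fields(sample_docs)
for field, info in summary.items():
    print(field, info)
```

In this toy data the id field shows up with two different types and popularity is missing from one document, which is the sort of irregularity a schema view over semi-structured data helps you spot before writing queries.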
Now we’ll join and query these two collections using SQL. Click the Query Editor tab in the left-hand sidebar to start writing and running SQL queries.
We constructed a query to suggest movies to a user based on their genre preference and the movie rating. Since genre is an array field (a single movie may fit multiple genres), we’ll need to perform an UNNEST to expand this array and create a record for each (genre, movie) pair. We’ll also filter against the movie_ratings collection to ensure that no movies the user has already rated are included in this list.
For the following query, we’ll use the Action genre and user 100. We’ll generalize these literals in the next step.
SELECT m.id, m.title
FROM commons.movies m, UNNEST(m.genres) AS genres
WHERE genres.name = 'Action'
  AND m.id NOT IN (
    SELECT r.movieId
    FROM commons.movie_ratings r
    WHERE r.userId = '100'
  )
ORDER BY m.popularity DESC;
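If the UNNEST and NOT IN steps are hard to picture, the following Python sketch mirrors the logic of the query above on a few in-memory rows. It is only an illustration of the semantics, not how Rockset executes the query. The rows and the recommend function are hypothetical; the field names follow the ones used in the SQL.

```python
# Hypothetical rows shaped like the two collections.
movies = [
    {"id": 1, "title": "Movie A", "popularity": 9.1,
     "genres": [{"name": "Action"}, {"name": "Comedy"}]},
    {"id": 2, "title": "Movie B", "popularity": 7.4,
     "genres": [{"name": "Drama"}]},
    {"id": 3, "title": "Movie C", "popularity": 8.2,
     "genres": [{"name": "Action"}]},
    {"id": 4, "title": "Movie D", "popularity": 9.9,
     "genres": [{"name": "Action"}]},
]
movie_ratings = [
    {"userId": "100", "movieId": 4},
    {"userId": "200", "movieId": 1},
]


def recommend(movies, ratings, genre, user_id):
    # Subquery: ids of movies this user has already rated.
    seen = {r["movieId"] for r in ratings if r["userId"] == user_id}
    # UNNEST: one (movie, genre) pair per element of the genres array.
    pairs = [(m, g) for m in movies for g in m["genres"]]
    # WHERE: keep pairs in the requested genre whose movie is not in `seen`.
    matches = [m for m, g in pairs
               if g["name"] == genre and m["id"] not in seen]
    # ORDER BY popularity DESC, then SELECT id, title.
    matches.sort(key=lambda m: m["popularity"], reverse=True)
    return [(m["id"], m["title"]) for m in matches]


result = recommend(movies, movie_ratings, "Action", "100")
print(result)  # [(1, 'Movie A'), (3, 'Movie C')]
```

Movie D is the most popular Action title in this toy data, but user 100 has already rated it, so it is excluded from the result.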
Use parameters to safely specify literal values in your SQL at runtime. We can add parameters to the SQL query to specify the genre and the userId at runtime.
You can toggle between the query results and the parameters in the query editor. Click to add a parameter.
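For this walkthrough, set the genre parameter to Action and the userId parameter to 100, the same values used as literals in the earlier query.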
We can tweak the SQL statement to incorporate the parameters we just created. Here’s the new SQL statement with the parameters:
SELECT m.id, m.title
FROM commons.movies m, UNNEST(m.genres) AS genres
WHERE genres.name = :genre
  AND m.id NOT IN (
    SELECT r.movieId
    FROM commons.movie_ratings r
    WHERE r.userId = :userId
  )
ORDER BY m.popularity DESC;
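The point of parameters is that values travel separately from the SQL text instead of being spliced into it. The Python sketch below builds a JSON request body for the parameterized statement above. The body layout (a sql object with query and parameters, each parameter having name, type and value) is an assumption about the REST API's request shape, and build_query_body is a hypothetical helper; check the API reference before relying on either.

```python
import json

# The parameterized statement from above; :genre and :userId are placeholders.
SQL = """
SELECT m.id, m.title
FROM commons.movies m, UNNEST(m.genres) AS genres
WHERE genres.name = :genre
  AND m.id NOT IN (
    SELECT r.movieId FROM commons.movie_ratings r WHERE r.userId = :userId
  )
ORDER BY m.popularity DESC
"""


def build_query_body(sql, **params):
    """Build a request body that keeps values out of the SQL text.

    Assumed shape: {"sql": {"query": ..., "parameters": [...]}}.
    All values are sent as strings here for simplicity.
    """
    return {
        "sql": {
            "query": sql,
            "parameters": [
                {"name": name, "type": "string", "value": str(value)}
                for name, value in params.items()
            ],
        }
    }


body = build_query_body(SQL, genre="Action", userId="100")
payload = json.dumps(body, indent=2)
print(payload)
```

Because the SQL string never changes, switching to another genre or user only means passing different keyword arguments to the helper.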
Let’s use the query with parameters to create a Query Lambda. Click Create Query Lambda. Query Lambdas are named, parameterized SQL queries stored in Rockset that can be executed from a dedicated REST endpoint. Use Query Lambdas to build applications backed by Rockset as opposed to querying with raw SQL directly from application code.
We will use the default parameter values for genre and userId at runtime.
Create an API key and copy the code snippet into the application code.
Open the terminal or your script/application and copy the request to execute the query.
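The console generates the exact request for you, so prefer that snippet. As a rough picture of what such a request might look like from application code, here is a sketch using only the Python standard library. Everything specific in it is an assumption or placeholder: the URL path layout, the ApiKey authorization scheme, the server address, the Query Lambda name recommend_movies, and the latest tag are not taken from this guide.

```python
import json
import urllib.request


def build_lambda_request(api_server, api_key, workspace, lambda_name, tag,
                         parameters):
    """Build (but do not send) a POST request that executes a Query Lambda.

    The path layout and header format below are assumptions; copy the real
    values from the snippet shown in the console.
    """
    url = (f"{api_server}/v1/orgs/self/ws/{workspace}"
           f"/lambdas/{lambda_name}/tags/{tag}")
    body = json.dumps({
        "parameters": [
            {"name": name, "type": "string", "value": str(value)}
            for name, value in parameters.items()
        ]
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_lambda_request(
    api_server="https://api.example.invalid",  # placeholder server address
    api_key="YOUR_API_KEY",                    # placeholder, never hard-code keys
    workspace="commons",
    lambda_name="recommend_movies",            # hypothetical Query Lambda name
    tag="latest",                              # hypothetical tag
    parameters={"genre": "Action", "userId": "100"},
)
print(request.get_method(), request.full_url)

# To actually send it, replace the placeholders and then run:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL, headers and body before any network call is made.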
Invite members of your team to Rockset through the console. To invite new members, click Users in the left-hand navigation. You can determine whether the new users should be administrators, members or have read-only access to Rockset.
Join us on the Slack community and share what you are looking to build with Rockset. We’re hanging out and ready to answer your questions.
Also check out some of the pages below to continue exploring Rockset: