Quickstart

Sign Up

Sign up for a Rockset account using your GitHub account, Google account, or email address. During your two-week trial, you will have $300 worth of credits to use however you see fit. No credit card required! Billing is determined by your compute (Virtual Instance size) and storage usage. After your trial ends, you can continue to use Rockset for free by using the FREE Virtual Instance and staying below the 2 GB of free storage.

Pro tip: If you’re using a larger Virtual Instance (VI) for your trial, switch it back to the Shared VI when you’re not actively using it to save credits! For more information, see the Billing documentation.

Create a Collection

Now that you've created an account and understand how to use your $300 free trial credits, let's navigate to the Collections tab of the Rockset Console where we will create your first collection!

A collection in Rockset is a set of Rockset documents. Similar to tables in traditional SQL databases, collections can be queried using SQL, either directly or using Query Lambdas.

In this tutorial, we will create two collections from public datasets hosted on AWS S3:

  • The Movies Dataset: Film Releases: A sample of movies and information including their genre, popularity, and revenue.
  • The Movies Dataset: Film Ratings: A sample of movie ratings by user.

Both datasets are publicly available here.

Create the Film Releases Collection

Follow the steps below to create the Film Releases collection:

  1. Click Create your first Collection in the Collections tab of the Rockset Console:

  2. Select Public Datasets as the data source for your collection:

  3. Select the dataset The Movies Dataset: Film Releases and click Start below:

  4. Configure and create the collection:

    • Under the Transform Data section, you will see that an Ingest Transformation has already been predefined for you. This transformation processes incoming data as it is written into your Rockset collection, cleaning up the data and defining data types for a few fields. For this tutorial, we recommend that you do not change the predefined ingest transformation, because doing so may affect the queries used later on.
    • Name your collection film_releases in the Collection Name field.
    • When you're ready, click Create at the top right of the page to complete the creation of your collection.
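
To make the idea of an ingest transformation more concrete, here is a minimal Python analogy of "cleaning up the data and defining data types for a few fields". It is not the transformation that Rockset predefines for this dataset; the function name transform, the choice of fields, and the raw values are all assumptions made for illustration.

```python
# Hypothetical sketch only: the real predefined ingest transformation is
# not reproduced here, and the raw values below are invented.

def transform(raw_doc):
    """Clean up one incoming document and give a few fields explicit types."""
    doc = dict(raw_doc)                            # copy; leave the input untouched
    doc["title"] = doc["title"].strip()            # clean up stray whitespace
    doc["popularity"] = float(doc["popularity"])   # text -> floating-point number
    doc["revenue"] = int(doc["revenue"])           # text -> integer
    return doc


raw = {"id": 1, "title": "  Example Movie ", "popularity": "12.5", "revenue": "1000000"}
clean = transform(raw)
print(clean)
```

The point of the analogy is that every document passes through the same function before it is stored, so later queries can rely on consistent field types.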

The source preview is automatically generated so you can explore the semi-structured JSON data in a tabulated form:

The collection creation process will take about 3 minutes to complete. The collection's initial status will be Created, after which it will change to Ready. At this point, documents will begin to flow into the collection gradually until the entire dataset is ingested. When completed, you should see something like this:

Note: You may need to refresh the screen for the status to update.

Create the Film Ratings Collection

Repeat the steps above to create the Film Ratings collection. This time, select the dataset The Movies Dataset: Film Ratings and name your collection film_ratings in the Collection Name field.

Execute a Query

Now that both collections have been set up, we can use SQL to query the two collections.

Sample Query

Below is a sample query that suggests movies to a user based on their genre preference and each movie's popularity. Because genres is an array field (a single movie may fit multiple genres), we use UNNEST to expand this array and create a record for each (genre, movie) pair. We also exclude movies that the specified user has already rated.
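
To build intuition for what UNNEST and the NOT IN subquery do, the following Python sketch reproduces the same logic on a few in-memory documents. It only illustrates the semantics and does not call Rockset; the sample documents and their values are invented, and the shape of the genres field (an array of objects with a name key) is inferred from the genres.name reference in the query.

```python
# Toy data: these documents are made up for illustration.
film_releases = [
    {"id": 1, "title": "Movie A", "popularity": 9.0,
     "genres": [{"name": "Action"}, {"name": "Comedy"}]},
    {"id": 2, "title": "Movie B", "popularity": 7.5,
     "genres": [{"name": "Action"}]},
    {"id": 3, "title": "Movie C", "popularity": 8.0,
     "genres": [{"name": "Drama"}]},
    {"id": 4, "title": "Movie D", "popularity": 9.5,
     "genres": [{"name": "Action"}, {"name": "Thriller"}]},
]
film_ratings = [{"user_id": 100, "movie_id": 2}]

# UNNEST(m.genres): one record per (movie, genre) pair.
pairs = [(m, g) for m in film_releases for g in m["genres"]]

# Subquery: IDs of the movies that user 100 has already rated.
rated = {r["movie_id"] for r in film_ratings if r["user_id"] == 100}

# WHERE genres.name = 'Action' AND m.id NOT IN (...)
matches = [m for m, g in pairs if g["name"] == "Action" and m["id"] not in rated]

# ORDER BY m.popularity DESC, then SELECT m.id, m.title
matches.sort(key=lambda m: m["popularity"], reverse=True)
result = [(m["id"], m["title"]) for m in matches]
print(result)
```

Movie B is an Action movie but is dropped because user 100 has rated it, and the remaining Action movies come back with the most popular first.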

Follow the steps below to enter and use the query:

  1. Navigate to the Query Editor tab of the Rockset Console to start writing and executing SQL queries.

  2. Copy the query below into the SQL editing area:

     SELECT
         m.id,
         m.title
     FROM
         commons.film_releases m,
         UNNEST(m.genres) as genres
     WHERE
         genres.name = 'Action'
         AND m.id NOT IN (
             SELECT
                 r.movie_id
             FROM
                 commons.film_ratings r
             WHERE
                 r.user_id = 100
         )
     ORDER BY
         m.popularity DESC
     ;
    
  3. Click Run to execute the query. The Results tab below the query shows the rows returned from the query:

Sample Query with Parameters

In the above query, we used the Action genre and user 100. Now, let's make these values parameters that can be specified at runtime. Follow the steps below to add and test these parameters:

  1. Click the + next to Parameters located below the query name tab to create a new parameter:

  2. Populate the parameter details with the following and press Return on your keyboard to create the new parameter:

    • Set Parameter Name to genre.
    • Set Type to string.
    • Set Parameter Value to Action.

  3. Repeat step 2 with the following parameter details and press Return on your keyboard to create the new parameter:

    • Set Parameter Name to user_id.
    • Set Type to int.
    • Set Parameter Value to 100.

  4. Modify the SQL statement from the previous section to incorporate the parameters created in steps 2 and 3 above:

    • Replace genres.name = 'Action' with genres.name = :genre.
    • Replace r.user_id = 100 with r.user_id = :user_id.

Here is the new SQL statement with these updates:

 SELECT
     m.id,
     m.title
 FROM
     commons.film_releases m,
     UNNEST(m.genres) as genres
 WHERE
     genres.name = :genre
     AND m.id NOT IN (
         SELECT
             r.movie_id
         FROM
             commons.film_ratings r
         WHERE
             r.user_id = :user_id
     )
 ORDER BY
     m.popularity DESC
 ;
  5. Click Run to execute the query. The Results tab below the query shows the rows returned for the Action genre with the User ID 100:
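
As a side note, the :genre and :user_id placeholder style can be tried offline. The sketch below uses Python's built-in sqlite3 module, which also accepts :name placeholders, to show how one statement can be reused with different values supplied at runtime. This is SQLite, not Rockset: the table layout, the film_genres join table (standing in for UNNEST, which SQLite lacks), the recommend helper, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database with a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film_releases (id INTEGER, title TEXT, popularity REAL);
    CREATE TABLE film_genres (movie_id INTEGER, name TEXT);
    CREATE TABLE film_ratings (user_id INTEGER, movie_id INTEGER);
    INSERT INTO film_releases VALUES
        (1, 'Movie A', 9.0), (2, 'Movie B', 7.5), (3, 'Movie C', 8.0);
    INSERT INTO film_genres VALUES (1, 'Action'), (2, 'Action'), (3, 'Drama');
    INSERT INTO film_ratings VALUES (100, 2);
""")

# Same shape as the tutorial query, with a join table in place of UNNEST.
QUERY = """
    SELECT m.id, m.title
    FROM film_releases m
    JOIN film_genres g ON g.movie_id = m.id
    WHERE g.name = :genre
      AND m.id NOT IN (
          SELECT r.movie_id FROM film_ratings r WHERE r.user_id = :user_id
      )
    ORDER BY m.popularity DESC
"""


def recommend(genre, user_id):
    """Run the query with parameter values bound at call time."""
    params = {"genre": genre, "user_id": user_id}
    return conn.execute(QUERY, params).fetchall()


print(recommend("Action", 100))
print(recommend("Drama", 100))
```

Because the values are bound rather than pasted into the SQL text, the statement itself never changes between calls; only the parameter values do.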

Next Steps

This completes the quickstart tutorial! Here are some suggestions for next steps: