This page describes how to setup a Jupyter notebook and configure it for use with Rockset.

    The steps below are intended for use with Python 3. They require that you have the latest Rockset Python SDK installed. Refer to the Python page for instructions on installing the Python SDK.


    Jupyter notebooks are an effective tool for data scientists to iterate on their work and share it with other data scientists. Rockset has deep integration with the Jupyter notebook workflow. There is deep SQL Magic and ipython-sql integration that lets you run SQL queries directly in your notebooks, turn the results into Pandas dataframes. There is also a Rockset Python SDK integration to do more advanced operations including creating collections and adding, removing, or modifying documents within collections.


    #Local Installation

    You must first install the Rockset Python SDK. After that, you can install the Jupyter package by following the official guide. Ensure that you install the version of Jupyter for Python 3. In addition to Jupyter, you will need a couple of extra packages that can be installed as follows:

    $ python3 -m pip install ipython-sql

    If you're using system Python, you may need to sudo before you can run the above commands. It is recommended to instead create a virtualenv.

    Once you have installed Jupyter and the ipython-sql dependency, you should be able to run it by starting it from the terminal or command prompt.

    $ jupyter notebook

    Once you're in the notebook environment, open up a new notebook and test that the installed version of the Python interpreter is 3.x.

    import sys

    Interpreter version in Jupyter

    If the version does not say 3.x, you may need to install a Python 3 kernel for your Jupyter notebook. This can be done using:

    $ python3 -m pip install ipykernel
    $ python3 -m ipykernel install --user

    When it is installed, you should restart your Jupyter notebook and check that the new Python 3 kernel is available to be selected.

    #Conda Installation

    If you're using Conda enviroments with Jupyter Notebooks, then you can install Rockset using conda install.

    $ conda install --yes --channel conda-forge --channel rockset rockset

    If you have multiple Conda environments then you can pass in the name of the conda environment with the conda install --name option.

    $ conda env list
    $ conda install --name YOUR-CONDA-ENV-NAME --yes --channel conda-forge --channel rockset rockset

    You can run these commands from your Jupyter notebook using the (!) shell escape:

    Conda install from Jupyter

    #Running in Docker

    You can also run a dockerized Jupyter installation. If you don't have docker installed, install docker using the instructions on the official page. Once you have docker installed, you can use any of the images listed in Jupyter Docker Stacks.

    If you're using the image: jupyter/minimal-notebook, you can bring that up as follows.

    $ docker run -p 8888:8888 jupyter/minimal-notebook

    By default, you will lose your notebook when the docker container is terminated. In order to run it with persistent storage, you should mount a host volume into the container. You can pass the -v commandline argument to docker run.

    $ chmod 777 /some/host/folder/for/work
    $ docker run -p 8888:8888 jupyter/minimal-notebook -v /some/host/folder/for/work:/home/jovyan/work

    Once the notebook is running, you can run the following command in the first cell in order to prepare it for use with Rockset.

    $ pip install rockset ipython-sql

    #Testing your Setup

    Once you have installed Jupyter, you can run the following commands in a cell to ensure that you have it setup correctly. Replace your-api-key with your own API key that you have generated from the Rockset console. (Refer to the quickstart if you are unfamiliar with using the Rockset console.)

    from rockset import Client, Q, F
    %load_ext sql
    %config SqlMagic.autocommit = False
    %sql rockset://your-api-key@api.rs2.usw2.rockset.com/
    %sql SELECT 1

    Check the output of the cell for any errors. If no exceptions are thrown, you're all set and Rockset is correctly configured for use with your notebook.


    #SQL Magic

    Prior to using the %sql or %%sql magic syntax to run SQL statements within your Jupyter notebook, you must execute a few statements to establish a connection to Rockset.

    %load_ext sql
    %config SqlMagic.autocommit = False
    %sql rockset://your-api-key@api.rs2.usw2.rockset.com/

    Once the connection is successfully established, you can run queries using either %sql or %%sql in a cell.

    %sql SELECT * from _events LIMIT 10
    LIMIT 10

    Both of the above examples are identical except that the latter allows running multi-line SQL statements. You can run queries against your own collections in this manner.

    #Python SDK

    You can also interact with Rockset using the Python SDK and query builder. Using the Python SDK allows you to read results using cursors and higher level objects, as well as allowing you to create/delete collections, add/remove/modify documents in a collection, etc. The query builder provides an abstraction over SQL syntax to build queries using Python constructs instead of raw SQL statements.

    from rockset import Client, Q, F
    rs = Client()
    result = rs.sql(Q('_events').select(F['details']).limit(10))
    for r in result:

    For detailed information on using the Python Query Builder, you can refer to the Python reference.


    You can also turn your results into a Pandas Dataframe for easy wrangling and/or visualization. If you are using the one-line %sql syntax, you can do the following:

    result = %sql SELECT * from _events LIMIT 1

    Magic to Dataframe

    If you're using the %%sql syntax, you can do this instead:

    %%sql result <<
    SELECT * from "_events" LIMIT 1 # next cell result.DataFrame()

    Magic to Dataframe 2

    #Next Steps

    Once you have Rockset with SQL Magic configured with your Jupyter notebook, you can consider using the Rockset Python SDK which will let you build queries iteratively and use Python syntax throughout.

    Join us on Slack!
    Building on Rockset? Come chat with us!