This page covers how to use a MongoDB Atlas collection as a data source in Rockset. This includes:
- Creating a MongoDB integration to securely connect collections in your MongoDB account with Rockset.
- Creating a collection which continuously syncs your data from a MongoDB collection into Rockset in real-time.
For the following steps, you must have access to a MongoDB Atlas account and be able to manage Custom Roles and Database Users within it. If you do not have access, please invite your MongoDB Atlas administrator to Rockset.
#Create a MongoDB Atlas Integration
The steps below show how to set up a MongoDB Atlas integration using the MongoDB SCRAM authentication mechanism. An integration can provide access to one or more MongoDB collections across different databases in the same MongoDB Atlas cluster. You can use an integration to create Rockset collections that continuously sync data from your MongoDB collections.
#Step 1: Configure MongoDB Atlas Custom Role
- Navigate to "Database Access" (from left navigation) in your MongoDB Atlas account for the cluster you want to connect to Rockset.
- Create a new custom role by navigating to "Custom Roles" and clicking "Add New Custom Role". If you already have a role set up for Rockset, you may update that existing role. For more details, refer to MongoDB Atlas documentation on Custom Roles.
- Set up read-only access to your MongoDB collection. Add the following Actions or roles: find, changeStream, and collStats. Also enter the names of the databases and collections for each of these actions or roles. You can update access to databases and collections in the MongoDB Atlas UI at any time without changing the Rockset integration. The same integration can be used to create more Rockset collections, based on the permissions granted.
- Save the newly created or updated custom role and give it a descriptive name. You will attach this custom role to a new or existing Atlas user.
#Why these permissions?
- find: Required for the initial collection scan when reading data.
- changeStream: Required for retrieving records from MongoDB Atlas Change Streams.
- collStats: Required for metadata about MongoDB Atlas collections.
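For reference, the three permissions above can be written out in the privilege-document shape that MongoDB's `createRole` command accepts for its `privileges` field. This is only a sketch to show what the role grants: in Atlas the role is configured through the UI as described in Step 1, and the helper name and the database and collection names below are hypothetical.

```python
# The three actions Rockset needs, as listed above.
ROCKSET_ACTIONS = ["find", "changeStream", "collStats"]


def build_privileges(namespaces):
    """Return one read-only privilege document per (database, collection) pair."""
    return [
        {
            "resource": {"db": db, "collection": coll},
            "actions": list(ROCKSET_ACTIONS),
        }
        for db, coll in namespaces
    ]


# Hypothetical namespaces, for illustration only.
privileges = build_privileges([("sales", "orders"), ("sales", "customers")])
```

Each entry grants the same three actions on exactly one collection, which matches the read-only scope the custom role is meant to have.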
#Step 2: Configure MongoDB Atlas User
You'll need to create a MongoDB Atlas user to grant Rockset permissions to access your MongoDB resources.
- Navigate to "Database Access" (from left navigation) in MongoDB Atlas UI.
- Set up a new user by navigating to "Database Users" and clicking "Add New Database User". Note: If you already have a user for Rockset set up, you may re-use it or update the custom role directly. For more details, refer to MongoDB Atlas documentation on Database Users.
- Using SCRAM Password authentication, enter a username and password for the database user and select the custom role created in Step 1 under "Database User Privileges".
- Finish by clicking "Add User" and record both username and password in the Rockset Console within a new MongoDB integration. Note that if you change the password later, you will need to drop and recreate the integration in Rockset.
#Step 3: Access Connection String
You'll need to provide the connection string for your MongoDB Atlas cluster so that Rockset can connect to it.
- Navigate to the cluster (from left navigation) you want to connect to Rockset and click on "Connect".
- Select "Connect your application" for connection method.
- Copy the "Connection String" and record it in the Rockset Console for the integration. Also provide the name of the database that connections will use by default. The connection string looks like mongodb+srv://<username>:<password>@cluster0.mongodb.net/<dbname>. You don't need to replace the username, password and dbname tags in the connection string.
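Before pasting the connection string into the Rockset Console, it can help to check that you copied the expected form. The following is a minimal sketch using only the Python standard library. The function name `inspect_connection_string` is an assumption made for illustration and is not part of any Rockset or MongoDB tooling.

```python
from urllib.parse import urlsplit


def inspect_connection_string(uri):
    """Split a MongoDB connection string into the parts Step 3 refers to."""
    parts = urlsplit(uri)
    if parts.scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError("not a MongoDB connection string: %r" % uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # The path component carries the default database name, if any.
        "database": parts.path.lstrip("/") or None,
        # True when the <username>/<password>/<dbname> tags are still present.
        "has_placeholders": "<" in uri and ">" in uri,
    }


info = inspect_connection_string(
    "mongodb+srv://<username>:<password>@cluster0.mongodb.net/<dbname>"
)
```

Here `info["host"]` is the cluster host name, and `info["has_placeholders"]` stays true because, as noted above, the tags do not need to be replaced.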
#Step 4: Whitelist Rockset IPs
To ensure connectivity with Atlas, you must allow inbound network access from Rockset to MongoDB Atlas by whitelisting Rockset's public IP addresses. For more details, refer to MongoDB Atlas documentation on Whitelist Entries. This is the most secure and recommended way to allow Rockset to access your MongoDB cluster. If you choose to skip adding Rockset whitelist entries, make sure you select "Allow Access From Anywhere", which opens the cluster to connections from any IP address.
- Navigate to "Network Access" (from left navigation) in MongoDB Atlas UI.
- Click on "Add IP Address" and create three whitelist entries for the following public IP addresses of the Rockset service:
#Create a Collection
Once you create a collection backed by MongoDB Atlas, Rockset first scans the MongoDB collection to ingest its existing records, and then uses MongoDB Change Streams to keep the Rockset collection up to date as records are added to, updated in, or deleted from the MongoDB collection.
If your MongoDB collection is a capped collection, MongoDB change streams don't receive deletes for old documents, so the Rockset collection can go out of sync. For this reason, we recommend setting a retention period on the Rockset collection at the time of creation.
You can create a collection from a MongoDB Atlas source in the Collections tab of the Rockset Console.
#How it works
When a MongoDB Atlas backed collection is created, indexing in Rockset occurs in two stages:
- A one-time full scan of the MongoDB collection in which all records are indexed and stored in the Rockset collection.
- Following that, continuous monitoring and sync of changes from the MongoDB collection (inserts, deletes and updates) to the Rockset collection in real-time using MongoDB Change Streams.
Once a MongoDB-backed collection is set up, it will be a replica of the MongoDB collection, up-to-date to within a few seconds.
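The two stages can be illustrated with a small in-memory simulation. This is a conceptual sketch only, not Rockset's implementation. The event dictionaries follow the general shape of MongoDB change events (`operationType`, `documentKey`, `fullDocument`, `updateDescription`), while the function names and sample documents are hypothetical.

```python
def initial_scan(source_docs):
    """Stage 1: copy every existing document into the replica, keyed by _id."""
    return {doc["_id"]: dict(doc) for doc in source_docs}


def apply_change(replica, event):
    """Stage 2: apply one change event to the replica."""
    op = event["operationType"]
    key = event["documentKey"]["_id"]
    if op in ("insert", "replace"):
        replica[key] = dict(event["fullDocument"])
    elif op == "update":
        desc = event["updateDescription"]
        doc = replica.setdefault(key, {"_id": key})
        # Simplification: only top-level fields are handled here.
        doc.update(desc.get("updatedFields", {}))
        for field in desc.get("removedFields", []):
            doc.pop(field, None)
    elif op == "delete":
        replica.pop(key, None)
    return replica


# Hypothetical sample data.
replica = initial_scan([{"_id": 1, "name": "a"}, {"_id": 2, "name": "b"}])
events = [
    {"operationType": "insert", "documentKey": {"_id": 3},
     "fullDocument": {"_id": 3, "name": "c"}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"name": "z"}, "removedFields": []}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]
for event in events:
    apply_change(replica, event)
```

After the loop, the replica holds documents 1 and 3, with document 1 carrying the updated name. Because every event is applied by `_id`, the replica converges on the state of the source once all events have been applied.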