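This page covers how to use an Amazon DynamoDB table as a data source in Rockset. This includes setting up an Amazon DynamoDB integration and creating a collection that syncs data from a DynamoDB table.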
For the following steps, you must have access to an AWS account and be able to manage AWS IAM policies and IAM users within it. If you do not have access, please invite your AWS administrator to Rockset.
The steps below show how to set up an Amazon DynamoDB integration using AWS Access Keys. An integration can provide access to one or more DynamoDB tables within your AWS account. You can use an integration to create collections that sync data from your DynamoDB tables.
Replace <your-table> with the name of your DynamoDB table. If you already have a Rockset policy set up, you can add the body of the Statement attribute to it.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"dynamodb:Scan",
"dynamodb:DescribeStream",
"dynamodb:GetRecords",
"dynamodb:GetShardIterator",
"dynamodb:DescribeTable",
"dynamodb:UpdateTable"
],
"Resource": [
"arn:aws:dynamodb:*:*:table/<your-table>",
"arn:aws:dynamodb:*:*:table/<your-table>/stream/*"
]
}
]
}
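If you manage policies from a script, the document above can be generated rather than edited by hand. The following is a minimal sketch using only the Python standard library; the helper name `build_rockset_policy` and its parameters are hypothetical and not part of any Rockset or AWS tooling.

```python
import json

# Actions copied from the policy above. dynamodb:UpdateTable is kept
# separate so that it can be left out of the generated policy.
REQUIRED_ACTIONS = [
    "dynamodb:Scan",
    "dynamodb:DescribeStream",
    "dynamodb:GetRecords",
    "dynamodb:GetShardIterator",
    "dynamodb:DescribeTable",
]
OPTIONAL_ACTIONS = ["dynamodb:UpdateTable"]


def build_rockset_policy(table_names, allow_update_table=True):
    """Return the IAM policy document as a dict (hypothetical helper)."""
    actions = list(REQUIRED_ACTIONS)
    if allow_update_table:
        actions += OPTIONAL_ACTIONS

    resources = []
    for name in table_names:
        # Each table contributes its table ARN and its stream ARN.
        resources.append(f"arn:aws:dynamodb:*:*:table/{name}")
        resources.append(f"arn:aws:dynamodb:*:*:table/{name}/stream/*")

    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": actions, "Resource": resources}
        ],
    }


# Print a policy for a single table named "my-table" (an example name).
print(json.dumps(build_rockset_policy(["my-table"]), indent=2))
```

The printed JSON can be pasted into the IAM policy editor; nothing in this sketch calls AWS.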
dynamodb:Scan - Required for initial table scans when reading data.
dynamodb:DescribeStream - Required for metadata about DynamoDB streams.
dynamodb:GetRecords - Required for retrieving records from DynamoDB streams.
dynamodb:GetShardIterator - Required for retrieving records from DynamoDB streams.
dynamodb:DescribeTable - Required for metadata about DynamoDB tables.
dynamodb:UpdateTable - Optional. Used to enable streams on a DynamoDB table. DynamoDB streams are required for live sync to work correctly. You can omit this permission if you would like to enable streams on your tables manually as described here. You must specify the StreamViewType as NEW_AND_OLD_IMAGES when creating the stream.
You can set up permissions for multiple tables, or even all tables, by modifying the Resource ARNs. The format of the ARN for DynamoDB is as follows: arn:aws:dynamodb:region:account-id:table/tablename.
You can substitute the following resources in the policy above to grant access to multiple tables as shown below:
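To grant access to all tables:
arn:aws:dynamodb:*:*:table/*
arn:aws:dynamodb:*:*:table/*/stream/*
To grant access to all tables whose names start with prod:
arn:aws:dynamodb:*:*:table/prod*
arn:aws:dynamodb:*:*:table/prod*/stream/*
To grant access to all tables in us-west-2:
arn:aws:dynamodb:us-west-2:*:table/*
arn:aws:dynamodb:us-west-2:*:table/*/stream/*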
Please note that you must also include the corresponding /stream/* permissions with the above for live sync to work correctly. For more details on how to specify a resource path, refer to the AWS documentation on DynamoDB ARNs.
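To sanity-check which ARNs a Resource entry covers, you can approximate the wildcard matching locally. The sketch below uses Python's `fnmatch.fnmatchcase` as a stand-in for IAM's `*` wildcard; this is an approximation for illustration only (IAM does not use `fnmatch`), and the account ID, table names, and stream label are made up.

```python
from fnmatch import fnmatchcase


def covered(arn, resource_patterns):
    """Return True if any pattern matches the ARN (approximate check)."""
    return any(fnmatchcase(arn, pattern) for pattern in resource_patterns)


# Made-up ARNs used only for this illustration.
table_arn = "arn:aws:dynamodb:us-west-2:123456789012:table/my-table"
stream_arn = table_arn + "/stream/2024-01-01T00:00:00.000"

table_only = ["arn:aws:dynamodb:*:*:table/my-table"]
table_and_stream = table_only + ["arn:aws:dynamodb:*:*:table/my-table/stream/*"]

# Without the /stream/* entry the stream ARN is not covered.
print("table only covers stream:", covered(stream_arn, table_only))
print("table + stream covers stream:", covered(stream_arn, table_and_stream))

# A prefix pattern covers tables that share the prefix, not others.
prod_patterns = ["arn:aws:dynamodb:*:*:table/prod*"]
print(covered("arn:aws:dynamodb:us-west-2:123456789012:table/prod-orders", prod_patterns))
print(covered("arn:aws:dynamodb:us-west-2:123456789012:table/staging-orders", prod_patterns))
```

A local check like this is no substitute for testing the policy in AWS, but it can catch a forgotten stream entry before the policy is saved.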
There are two mechanisms by which you can grant Rockset permissions to access your AWS resource. Although Access Keys are supported, Cross-Account roles are strongly recommended as they are more secure and easier to manage.
The most secure way to grant Rockset access to your AWS account is to give Rockset’s account cross-account access to it. To do so, you’ll need to create an IAM Role with your newly created policy attached, which Rockset can then assume.
You’ll need information from the Rockset Console to create and save this integration.
Navigate to the IAM service in the AWS Management Console.
Set up a new role by navigating to Roles and clicking Create role. Note: if you already have a role for Rockset set up, you may re-use it and either add or update the above policy directly.
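For reference, a cross-account role is governed by a trust policy that names the external account allowed to assume it. The sketch below builds such a document in Python. The account ID and external ID are placeholders for the values shown in the Rockset Console, the external ID condition is an assumption based on common AWS cross-account practice rather than something this page confirms, and `build_trust_policy` is a hypothetical helper name.

```python
import json


def build_trust_policy(trusted_account_id, external_id):
    """Return an IAM trust policy dict for a cross-account role (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The external account that is allowed to assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Assumption: the caller must present this external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values; substitute the ones from the Rockset Console.
print(json.dumps(build_trust_policy("<rockset-account-id>", "<external-id>"), indent=2))
```

The AWS Management Console generates an equivalent document when you choose another AWS account as the trusted entity, so this is mainly useful for reviewing what the role permits.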
Navigate to the IAM service in the AWS Management Console.
Create a new user by navigating to Users and clicking Add User. If you have already created a user for Rockset in the past, you can attach the policy created in the previous section to that user.
Enter a name for the user and check the Programmatic access option. Click to continue.
Choose Attach existing policies directly then select the policy you created in Step 1. Click through the remaining steps to finish creating the user.
When the new user is successfully created you should see the Access key ID and Secret access key displayed on the screen.
If you are attaching the policy to an existing IAM user, you can navigate to “Security Credentials” under the IAM user and generate a new access key.
Once you create a collection backed by Amazon DynamoDB, Rockset first scans the DynamoDB table to ingest its existing contents, then uses the table's stream to keep the collection updated as new objects are added to the DynamoDB table. The sync latency is no more than 5 seconds under regular load.
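Because ongoing sync depends on the table's stream, it can help to confirm the stream settings before creating a collection. The sketch below inspects a dict shaped like the response of the DynamoDB DescribeTable API (for example, as returned by boto3's `describe_table`). The helper name `stream_ready` and the sample responses are illustrative assumptions, and no AWS call is made here.

```python
# View type that this page says must be set on the stream.
REQUIRED_VIEW_TYPE = "NEW_AND_OLD_IMAGES"


def stream_ready(describe_table_response):
    """Return True if the stream is enabled with the required view type."""
    table = describe_table_response.get("Table", {})
    spec = table.get("StreamSpecification", {})
    return bool(spec.get("StreamEnabled")) and (
        spec.get("StreamViewType") == REQUIRED_VIEW_TYPE
    )


# Hand-written sample responses standing in for real DescribeTable output.
good = {
    "Table": {
        "TableName": "my-table",
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    }
}
wrong_view = {
    "Table": {
        "TableName": "my-table",
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "KEYS_ONLY",
        },
    }
}
no_stream = {"Table": {"TableName": "my-table"}}

for name, response in [("good", good), ("wrong_view", wrong_view), ("no_stream", no_stream)]:
    print(name, stream_ready(response))
```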
In the Rockset Console, you can create a collection from Workspace > Collections > Create Collection.
Using the CLI, you can run the following:
$ rock create collection my-first-dynamodb-collection \
dynamodb://my-table \
--integration=aws-rockset-readonly
Collection "my-first-dynamodb-collection" was created successfully.
Note that these operations can also be performed using any of the Rockset client libraries.
When a DynamoDB-backed collection is created, ingestion occurs in two stages: an initial full scan of the table, followed by continuous ingestion from the DynamoDB stream.
Once a DynamoDB-backed collection is set up, it will be a replica of the DynamoDB table, up-to-date to within a few seconds.
Each DynamoDB table is configured with read capacity units (RCUs), which represent an upper bound on the read requests a client can issue. Rockset consumes RCUs to perform strongly consistent scans during the initial full table scan. Strongly consistent scans guarantee that no updates to the table are missed between the beginning of the scan and the start of ingestion of the DynamoDB stream.
DynamoDB lets you configure RCUs so that applications can read at a faster rate. Likewise, you can specify an upper bound on the RCUs that Rockset uses during the initial scan. Allocating a higher number of RCUs to Rockset on your DynamoDB table results in faster ingest speeds.
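To get a feel for how the RCU budget affects the initial scan, a rough estimate can be computed from DynamoDB's definition of an RCU: one strongly consistent read per second of up to 4 KB. The sketch below is a back-of-the-envelope calculation only; it ignores per-request rounding, throttling, and any processing time on the Rockset side, and `estimate_scan_seconds` is a hypothetical helper name.

```python
# One RCU covers one strongly consistent read per second of up to 4 KB.
BYTES_PER_RCU_PER_SECOND = 4 * 1024


def estimate_scan_seconds(table_size_bytes, rcus):
    """Rough lower bound on full-scan time for a given RCU budget."""
    if rcus <= 0:
        raise ValueError("rcus must be positive")
    return table_size_bytes / (rcus * BYTES_PER_RCU_PER_SECOND)


# Example: a 10 GiB table scanned with budgets of 1000 and 2000 RCUs.
table_size = 10 * 1024**3
for budget in (1000, 2000):
    seconds = estimate_scan_seconds(table_size, budget)
    print(f"{budget} RCUs: about {seconds / 60:.1f} minutes")
```

Under these assumptions, doubling the RCU budget halves the estimated scan time, which matches the statement that more RCUs give faster ingest.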
RCUs in DynamoDB can be configured in two different modes: Provisioned and On-Demand.