This page describes how to create collections from CSV files.
Rockset can parse raw CSV data.
In this section we will create a collection from a dataset hosted on AWS S3. Click on Create Collection in the Overview tab to begin.
Choose an appropriate name and, optionally, a description, then select Amazon S3 as the source from the Add Source dropdown. Provide the AWS S3 bucket name and prefix (if any), and select the integration from the Integration Name dropdown, or choose None if the bucket is public.
Select 'CSV' from the Format dropdown, which will show a few more options to be configured for CSV format support. Configure them as follows:
- First line of file as column names - Select this option if the CSV source contains column names in the first line
- Specify Columns manually - Select this option if you want to provide custom names for each column in the CSV data source. This option will ask you to provide a name and datatype for each column
- Generate column names automatically - Rockset will automatically generate unique column names (c1, c2, ..) for the CSV data source
- Separator - The character that separates fields in the CSV data source (default value is
- Encoding - Select the encoding format. Supported encodings are UTF-8, UTF-16 and ISO 8859-1
- Quote Character - A one-character string used to quote fields that contain special characters, such as the separator or the quote character itself, or that contain newline characters (default value is
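To get a feel for how these options interact before configuring the collection, the following is a minimal local sketch using Python's standard `csv` module. It is not Rockset's parser and may differ from it in edge cases. The function name `parse_csv`, its parameter names, and the `,` and `"` defaults are assumptions chosen for illustration (they are Python's own defaults, not values taken from this page).

```python
import csv
import io


def parse_csv(raw, *, header="first_line", names=None,
              separator=",", quote_char='"', encoding="utf-8"):
    """Parse CSV bytes into a list of dicts, one dict per data row.

    header (hypothetical option names mirroring the choices above):
      "first_line" - use the first row as column names
      "manual"     - use the caller-supplied `names`
      "auto"       - generate c1, c2, ... from the width of the first row
    """
    # Decode with the selected encoding, e.g. "utf-8", "utf-16", "iso-8859-1".
    text = raw.decode(encoding)

    # newline="" leaves line endings untouched so that the csv module can
    # handle quoted fields containing embedded newlines itself.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=separator, quotechar=quote_char)
    rows = list(reader)
    if not rows:
        return []

    if header == "first_line":
        columns, data = rows[0], rows[1:]
    elif header == "manual":
        if names is None:
            raise ValueError("names is required when header='manual'")
        columns, data = list(names), rows
    elif header == "auto":
        columns = [f"c{i}" for i in range(1, len(rows[0]) + 1)]
        data = rows
    else:
        raise ValueError(f"unknown header mode: {header!r}")

    return [dict(zip(columns, row)) for row in data]


if __name__ == "__main__":
    # A pipe-separated file whose second field contains the separator,
    # so it has to be wrapped in the quote character.
    sample = b'name|note\nada|"likes | pipes"\n'
    print(parse_csv(sample, separator="|"))
```

Running a small sample of the real file through a check like this can reveal a wrong separator or encoding choice early: a mismatched separator typically shows up as every row collapsing into a single column.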
Click Create on the top right to create the collection. You should see a new collection in state Created, and it can take up to a minute for the collection to become