This page describes how to implement field whitelisting using field mappings.
Field whitelisting is used to retain certain fields in a document, and drop all the other fields that do not match. This is useful for ingesting data where the schema is unknown, but where only certain fields are of interest.
The syntax for whitelisting is very similar to the field mapping syntax, except for the addition of a new property:
is_drop_all_fields will drop all fields in each ingested document, except for the fields we specify in output fields.
By default, all fields will be dropped if
is_drop_all_fields is true, except for the
Field whitelisting can be defined in a YAML file specified at collection creation time, such as the one below.
type: COLLECTION name: users field_mappings: - name: drop_all # `is_drop_all_fields` should be specified only once. # Additionally, the mapping specifying `is_drop_all_fields` should not have an output or input field. is_drop_all_fields: true - name: whitelist_name input_fields: - field_name: 'name' param: 'name' output_field: field_name: 'name' value: # An "identity" SQL expression that retains the `name` field. sql: ':name' - name: whitelist_age input_fields: - field_name: 'age' param: 'age' output_field: field_name: 'age' value: # A SQL expression that retains the `age` field while simultaneously casting it to an integer. sql: 'CAST(:age as INTEGER)'
In the example above, we whitelist the
age field, while also applying a SQL expression on the latter. Now, any document that comes in will have all its fields dropped, except for the two whitelisted fields.