Term Matcher Functions

BOOST

BOOST(boost_value, term) Set the boost value (a positive floating point value) for a specific search term so that the term contributes the specified value. Unboosted terms contribute a default score of 1.0. Note that following sample queries use collection data described at the top of this page.

/*
In this example, `doc2` got:
- 1 point for the occurrence of the word 'hello'
- 3 points for the occurrence of the word 'world'
`doc4` got:
- 1 point for the occurrence of the word 'hello'
*/
SELECT
    _id,
    words,
    score() score
FROM
    data
WHERE
    search(
        has_term(words, 'hello'),
        boost(3, has_term(words, 'world'))
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc2 | ['hello', 'world', 'goodbye', 'world'] | 4 | | doc4 | ['hello', 'goodbye'] | 1 | +------+----------------------------------------+-------+

/*
In this example, `doc4` got:
- 10 points for the occurrence of the word 'hello'
- 1 point for the occurrence of the word 'goodbye'
- 0.5 points due to proximity range search as the longest run in it
  ['hello', 'goodbye'] is of length 2
`doc2` got:
- 10 points for the occurrence of the word 'hello'
- 1 point for the occurrence of the word 'goodbye'
- 0 points due to proximity range search as the longest run in it ['hello']
  or ['goodbye'] is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        proximity(
            boost(10, has_term(words, 'hello')),
            has_term(words, 'goodbye')
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc4 | ['hello', 'goodbye'] | 11.5 | | doc2 | ['hello', 'world', 'goodbye', 'world'] | 11 | +------+----------------------------------------+-------+

/*
In this example, `doc4` got 3 * (10 + 1 + 0.5) + 5 points, where
- 10 points are for the occurrence of the word 'hello'
- 1 point is for the occurrence of the word 'goodbye'
- 0.5 points are due to proximity range search as the longest run in it
  ['hello', 'goodbye'] is of length 2
- 5 points are for the occurrence of the word 'goodbye'
`doc2` got 3 * (10 + 1 + 0) + 5 points, where
- 10 points are for the occurrence of the word 'hello'
- 1 point is for the occurrence of the word 'goodbye'
- 0 points are due to proximity range search as the longest run in it ['hello']
  or ['goodbye'] is of length 1
- 5 points are for the occurrence of the word 'goodbye'
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        BOOST(
            3,
            proximity(
                BOOST(10, has_term(words, 'hello')),
                has_term(words, 'goodbye')
            )
        ),
        BOOST(5, has_term(words, 'goodbye'))
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc4 | ['hello', 'goodbye'] | 39.5 | | doc2 | ['hello', 'world', 'goodbye', 'world'] | 38 | +------+----------------------------------------+-------+

CONTAINS

CONTAINS(field, search_string[, locale])
Tokenizes the given search string (normalized and lower-cased) in the given locale (default: en_US) and create a proximity range from the terms.

/*
In this example, `doc4` got:
- 2 points, 1 point each for the occurrence of words ['hello', 'goodbye']
- 0.5 points due to proximity range search as the longest run in it
  ['hello', 'goodbye'] is of length 2
`doc2` got:
- 2 points, 1 point each for the occurrence of words ['hello', 'goodbye']
- 0 points due to proximity range search as the longest run in it ['hello']
  or ['goodbye'] is of length 1
*/

SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    SEARCH(CONTAINS(words, 'hello goodbye')) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc4 | ['hello', 'goodbye'] | 2.5 | | doc2 | ['hello', 'world', 'goodbye', 'world'] | 2 | +------+----------------------------------------+-------+

/*
In this example, `doc1` got:
- 5 points, 1 point each for the occurrence of words
  ['quick', 'brown', 'fox', 'jumps' 'over']
- 2 points due to proximity range search as the longest run in it
  ['quick', 'brown', 'fox', 'jumps', 'over'] is of length 5
`doc3` got:
- 1 point for the occurrence of the word 'fox'
- 0 points due to proximity range search as the longest run in it ['fox']
  is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    SEARCH(
        CONTAINS(
            words,
            'quick brown fox jumps over that elephant'
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC;

+------+-------------------------------------------------------------------------+-------+ | id | words | score | |------+-------------------------------------------------------------------------|-------| | doc1 | ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog'] | 7 | | doc3 | ['cat', 'dog', 'fox'] | 1 | +------+-------------------------------------------------------------------------+-------+

/*
In this example, `doc1` got:
- 3 points, 1 point each for the occurrence of words ['brown', 'fox', 'over']
- 0.5 points due to proximity range search as the longest run in it
  ['brown', 'fox'] is of length 2
`doc3` got:
- 1 point for the occurrence of the word 'fox'
- 0 points due to proximity range search as the longest run in it ['fox']
  is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    SEARCH(
        CONTAINS(
            words,
            'majestic brown fox jumped over that elephant'
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC;

+------+-------------------------------------------------------------------------+-------+ | id | words | score | |------+-------------------------------------------------------------------------|-------| | doc1 | ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog'] | 3.5 | | doc3 | ['cat', 'dog', 'fox'] | 1 | +------+-------------------------------------------------------------------------+-------+

🚧
Using NGRAMS in search queries with CONTAINS
Using CONTAINS lower-cases the string and will only match lower-cased instances. For example, if you have the search string 'Hello', the search will actually be for the lower-cased ['hello'] and will only match lower-cased instances.
Because of this, if you are using CONTAINS with NGRAMS you must apply the LOWER function before the NGRAMS function to get an accurate search result.
SELECT NGRAMS(LOWER('Hello'), 2)

HAS_TERM

HAS_TERM(field, term) A term matcher that lets you search documents where the specified field contains the specified term. Upon a successful match, this matcher contributes a score of 1. Note that following sample queries use collection data described at the top of this page.

/*
In this example, `doc2` got 2 points, 1 point each for the occurrence of words
['hello', 'world']. No other document contains both 'hello' and 'world' in the
field words.
*/
SELECT
    _id,
    words,
    score() score
FROM
    data
WHERE
    search(
        has_term(words, 'hello'),
        has_term(words, 'world')
    )
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc2 | ['hello', 'world', 'goodbye', 'world'] | 2 | +------+----------------------------------------+-------+

/*
In this example, `doc2` got 2 points, 1 point each for the occurrence of words
['hello', 'world']. `doc4` got 1 point for the occurrence of the word 'hello'.
*/
SELECT
    _id,
    words,
    score() score
FROM
    data
WHERE
    search(
        has_term(words, 'hello'),
        has_term(words, 'world')
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc2 | ['hello', 'world', 'goodbye', 'world'] | 2 | | doc4 | ['hello', 'goodbye'] | 1 | +------+----------------------------------------+-------+

PROXIMITY

PROXIMITY(term_matcher[, term_matcher]*) Create a proximity range matcher using the given term matchers. The term matchers will contribute 1 point to the final score, just like other term matchers in the SEARCH query. But the proximity range mather will also contribute a proximity score that depends on the longest consecutive match. The longest consecutive match is the longest sequence of N consecutive terms such that: - term at index T satisfies term matcher at index P - term at index T+1 satisfies term matcher at index P+1 - ... - term at index T+N-1 satisfies term matcher at index P+N-1 Based on the length N of the longest consecutive match, the proximity score is computed as ((N - 1) / 2). Note that following sample queries use collection data described at the top of this page.

/*
In this example, `doc4` got:
- 2 points, 1 point each for the occurrence of words ['hello', 'goodbye']
- 0.5 points due to proximity range search as the longest run in it
  ['hello', 'goodbye'] is of length 2
`doc2` got:
- 2 points, 1 point each for the occurrence of words ['hello', 'goodbye']
- 0 points due to proximity range search as the longest run in it ['hello']
  or ['goodbye'] is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        proximity(
            has_term(words, 'hello'),
            has_term(words, 'goodbye')
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc4 | ['hello', 'goodbye'] | 2.5 | | doc2 | ['hello', 'world', 'goodbye', 'world'] | 2 | +------+----------------------------------------+-------+

/*
In this example, `doc1` got:
- 5 points, 1 point each for the occurrence of words
  ['quick', 'brown', 'fox', 'jumps' 'over']
- 2 points due to proximity range search as the longest run in it
  ['quick', 'brown', 'fox', 'jumps', 'over'] is of length 5
`doc3` got:
- 1 point for the occurrence of the word 'fox'
- 0 points due to proximity range search as the longest run in it ['fox']
  is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        proximity(
            has_term(words, 'quick'),
            has_term(words, 'brown'),
            has_term(words, 'fox'),
            has_term(words, 'jumps'),
            has_term(words, 'over'),
            has_term(words, 'that'),
            has_term(words, 'elephant')
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC;

/*
In this example, `doc1` got:
- 3 points, 1 point each for the occurrence of words ['brown', 'fox', 'over']
- 0.5 points due to proximity range search as the longest run in it
  ['brown', 'fox'] is of length 2
`doc3` got:
- 1 point for the occurrence of the word 'fox'
- 0 points due to proximity range search as the longest run in it ['fox']
  is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        proximity(
            has_term(words, 'majestic'),
            has_term(words, 'brown'),
            has_term(words, 'fox'),
            has_term(words, 'jumped'),
            has_term(words, 'over'),
            has_term(words, 'that'),
            has_term(words, 'elephant')
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC;

+------+-------------------------------------------------------------------------+-------+ | id | words | score | |------+-------------------------------------------------------------------------|-------| | doc1 | ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog'] | 3.5 | | doc3 | ['cat', 'dog', 'fox'] | 1 | +------+-------------------------------------------------------------------------+-------+

PROXIMITY_BOOST

PROXIMITY_BOOST(boost_value, proximity_range) Set the proximity range boost value (a positive floating point value) for the given specific search proximity range. This affects the proximity range's contribution to the score i.e. the proximity score, but not the contributions of the individual terms in the range. proximity_range must be a proximity range created using either PROXIMITY or CONTAINS. Note that following sample queries use collection data described at the top of this page.

/*
In this example, `doc4` got 2 + 10 * 0.5 points, where
- 2 points are for the occurrence of the words ['hello', 'goodbye']
- 0.5 points are due to proximity range search as the longest run in it
  ['hello', 'goodbye'] is of length 2
`doc2` got 2 + 10 * 0 points, where
- 2 points are for the occurrence of the words ['hello', 'goodbye']
- 10 * 0 points due to proximity range search as the longest run in it ['hello']
  or ['goodbye'] is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        proximity_boost(
            10,
            proximity(
                has_term(words, 'hello'),
                has_term(words, 'goodbye')
            )
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc4 | ['hello', 'goodbye'] | 7 | | doc2 | ['hello', 'world', 'goodbye', 'world'] | 2 | +------+----------------------------------------+-------+

/*
In this example, `doc4` got 5 + 1 + 10 * 0.5 points, where
- 5 points are for the occurrence of the word 'hello'
- 1 point is for the occurrence of the word 'goodbye'
- 0.5 points are due to proximity range search as the longest run in it
  ['hello', 'goodbye'] is of length 2
`doc2` got 5 + 1 + 10 * 0 points, where
- 5 points are for the occurrence of the word 'hello'
- 1 point is for the occurrence of the word 'goodbye'
- 10 * 0 points are due to proximity range search as the longest run in it
  ['hello'] or ['goodbye'] is of length 1
*/
SELECT
    _id id,
    words,
    score() score
FROM
    data
WHERE
    search(
        proximity_boost(
            10,
            proximity(
                boost(5, has_term(words, 'hello')),
                has_term(words, 'goodbye')
            )
        )
    ) OPTION(match_all = false)
ORDER BY
    score() DESC

+------+----------------------------------------+-------+ | id | words | score | |------+----------------------------------------|-------| | doc4 | ['hello', 'goodbye'] | 11 | | doc2 | ['hello', 'world', 'goodbye', 'world'] | 6 | +------+----------------------------------------+-------+