Vector Distance Functions

DOT_PRODUCT

DOT_PRODUCT(array, array) Computes the dot product of two vectors.

SELECT DOT_PRODUCT([1, 2, 3, 4], [5, 6, 7, 8])
70

(= 1*5 + 2*6 + 3*7 + 4*8)

SELECT DOT_PRODUCT([1.1, 2.2, 3.3, 4.4], [5.5, 6.6, 7.7, 8.8])
84.7

(= 1.1*5.5 + 2.2*6.6 + 3.3*7.7 + 4.4*8.8)

SELECT DOT_PRODUCT([5, 6], [1, 2, 3, 4])
Error: Cannot apply operation DOT_PRODUCT on vectors of different sizes 2 and 4.
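
As a cross-check on the arithmetic above, here is a minimal Python sketch of the same computation. It is a reference illustration only, not the engine's implementation; the function name dot_product and the choice of ValueError for mismatched sizes are assumptions made for this example.

```python
def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        # Mirrors the documented error for vectors of different sizes.
        # Raising ValueError is an assumption of this sketch.
        raise ValueError(
            "Cannot apply operation DOT_PRODUCT on vectors of "
            f"different sizes {len(a)} and {len(b)}."
        )
    return sum(x * y for x, y in zip(a, b))


print(dot_product([1, 2, 3, 4], [5, 6, 7, 8]))  # 70
```

With floating-point inputs the result can differ from the documented value in the last few digits, so compare with a small tolerance rather than with exact equality.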

EUCLIDEAN_DIST

EUCLIDEAN_DIST(array, array) Computes the Euclidean distance between two vectors (also referred to as the L2 distance, that is, the L2 norm of their difference). Euclidean distance is the square root of the sum of squared differences between corresponding elements of the two vectors.

SELECT EUCLIDEAN_DIST([1, 2, 3, 4], [5, 6, 7, 8])
8
SELECT EUCLIDEAN_DIST([1.1, 2.2, 3.3, 4.4], [5.5, 6.6, 7.7, 8.8])
8.8
SELECT EUCLIDEAN_DIST([1, 2, 3, 4], [5.5, 6.6, 7.7, 8.8])
9.302688
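
The definition above can be sketched in a few lines of Python. This is an illustrative reference, not the engine's implementation; the name euclidean_dist is hypothetical, and the size check is an assumption carried over from the DOT_PRODUCT error, since this page does not state how EUCLIDEAN_DIST treats vectors of different sizes.

```python
import math


def euclidean_dist(a, b):
    """Square root of the sum of squared element-wise differences."""
    if len(a) != len(b):
        # Assumption: reject mismatched sizes, as DOT_PRODUCT does.
        raise ValueError(
            f"vectors have different sizes {len(a)} and {len(b)}"
        )
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_dist([1, 2, 3, 4], [5, 6, 7, 8]))  # 8.0
```

Each of the four differences in the first example is 4, so the sum of squares is 64 and the distance is exactly 8.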

COSINE_SIM

COSINE_SIM(array, array) Computes the cosine similarity of two vectors. Cosine similarity is the dot product of the two vectors divided by the product of their magnitudes.

SELECT COSINE_SIM([1, 2, 3, 4], [5, 6, 7, 8])
0.968864
SELECT COSINE_SIM([1.1, 2.2, 3.3, 4.4], [5.1, 6.2, 7.3, 8.4])
0.971264
SELECT COSINE_SIM([1.1, 2.2, 3.3, 4.4], [1, 1, 1, 1])
0.912871
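
To make the formula concrete, the following Python sketch computes the dot product and divides it by the product of the two magnitudes. It is a reference illustration under stated assumptions, not the engine's implementation: the name cosine_sim and the size check are hypothetical, and this page does not say how COSINE_SIM handles a zero-magnitude vector.

```python
import math


def cosine_sim(a, b):
    """Dot product divided by the product of the two magnitudes."""
    if len(a) != len(b):
        # Assumption: reject mismatched sizes, as DOT_PRODUCT does.
        raise ValueError(
            f"vectors have different sizes {len(a)} and {len(b)}"
        )
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # In this sketch a zero-magnitude vector raises ZeroDivisionError.
    return dot / (norm_a * norm_b)


print(round(cosine_sim([1, 2, 3, 4], [5, 6, 7, 8]), 6))  # 0.968864
```

Because both magnitudes appear in the denominator, scaling either vector by a positive constant leaves the result unchanged, which is why [1.1, 2.2, 3.3, 4.4] and [1, 2, 3, 4] give the same similarity against [1, 1, 1, 1].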

APPROX_DOT_PRODUCT

Approximate version of DOT_PRODUCT. If the query orders by the result of this function in descending order, an approximate inner_product similarity index is used if one is available. Otherwise, the behavior is the same as DOT_PRODUCT.

SELECT APPROX_DOT_PRODUCT([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
70.0

APPROX_EUCLIDEAN_DIST

Approximate version of EUCLIDEAN_DIST. If the query orders by the result of this function in ascending order, an approximate l2 similarity index is used if one is available. Otherwise, the behavior is the same as EUCLIDEAN_DIST.

SELECT APPROX_EUCLIDEAN_DIST([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
8.0
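
The ordering direction matters for both approximate functions: descending for APPROX_DOT_PRODUCT puts the largest similarity first, and ascending for APPROX_EUCLIDEAN_DIST puts the smallest distance first. The Python sketch below shows the exact, brute-force ranking that such an ordered query expresses, which is the result an approximate index would presumably try to approach. The helper names and sample vectors are made up for illustration and say nothing about how the index itself works.

```python
import math


def top_k_by_dot_product(query, vectors, k):
    """Indices of the k vectors with the largest dot product (descending)."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def top_k_by_euclidean_dist(query, vectors, k):
    """Indices of the k vectors with the smallest distance (ascending)."""
    dists = [math.dist(query, vec) for vec in vectors]
    order = sorted(range(len(vectors)), key=lambda i: dists[i])
    return order[:k]


# Hypothetical sample data.
candidates = [
    [5.0, 6.0, 7.0, 8.0],
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 1.0],
]
query = [1.0, 2.0, 3.0, 4.0]

print(top_k_by_dot_product(query, candidates, 2))     # [0, 1]
print(top_k_by_euclidean_dist(query, candidates, 2))  # [1, 2]
```

The two rankings differ on the same data: the dot product favors long vectors pointing in a similar direction, while Euclidean distance favors vectors that are close to the query.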