Relational Data - Aggregate Functions - Reference - Uniqcombined64

From FojiSoft Docs

Calculates the approximate number of different argument values. It is the same as uniqCombined, but uses a 64-bit hash for all data types rather than just for the String data type.

uniqCombined64(HLL_precision)(x[, ...])

Parameters

  • HLL_precision: The base-2 logarithm of the number of cells in HyperLogLog. Optionally, you can use the function as uniqCombined64(x[, ...]). The default value for HLL_precision is 17, which is effectively 96 KiB of space (2^17 cells, 6 bits each).
  • X: A variable number of parameters. Parameters can be Tuple, Array, Date, DateTime, String, or numeric types.

Returned value

Implementation details

The uniqCombined64 function: - Calculates a hash (64-bit hash for all data types) for all parameters in the aggregate, then uses it in calculations. - Uses a combination of three algorithms: array, hash table, and HyperLogLog with an error correction table. - For a small number of distinct elements, an array is used. - When the set size is larger, a hash table is used. - For a larger number of elements, HyperLogLog is used, which will occupy a fixed amount of memory. - Provides the result deterministically (it does not depend on the query processing order).

Since it uses 64-bit hash for all types, the result does not suffer from very high error for cardinalities significantly larger than UINT_MAX like uniqCombined does, which uses a 32-bit hash for non-String types.


Compared to the uniq function, the uniqCombined64 function:

  • Consumes several times less memory.
  • Calculates with several times higher accuracy.

Example

In the example below uniqCombined64 is run on 1e10 different numbers returning a very close approximation of the number of different argument values.

Query:

SELECT uniqCombined64(number) FROM numbers(1e10);

Result:

┌─uniqCombined64(number)─┐
│             9998568925 │ -- 10.00 billion
└────────────────────────┘

By comparison the uniqCombined function returns a rather poor approximation for an input this size.

Query:

SELECT uniqCombined(number) FROM numbers(1e10);

Result:

┌─uniqCombined(number)─┐
│           5545308725 │ -- 5.55 billion
└──────────────────────┘

See Also