Relational Data - Functions - Other Functions

From FojiSoft Docs
Revision as of 18:52, 28 August 2024 by Chris.Hansen (talk | contribs) (Import ClickHouse Docs: Wed Aug 28 2024 14:52:29 GMT-0400 (Eastern Daylight Time))
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

hostName

Returns the name of the host on which this function was executed. If the function executes on a remote server (distributed processing), the remote server name is returned. If the function executes in the context of a distributed table, it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Syntax

hostName()

Returned value

getMacro

Returns a named value from the macros section of the server configuration.

Syntax

getMacro(name);

Arguments

  • name — Macro name to retrieve from the <macros> section. String.

Returned value

  • Value of the specified macro. String.

Example

Example <macros> section in the server configuration file:

<macros>
    <test>Value</test>
</macros>

Query:

SELECT getMacro('test');

Result:

┌─getMacro('test')─┐
│ Value            │
└──────────────────┘

The same value can be retrieved as follows:

SELECT * FROM system.macros
WHERE macro = 'test';
┌─macro─┬─substitution─┐
│ test  │ Value        │
└───────┴──────────────┘

fqdn

Returns the fully qualified domain name of the ClickHouse server.

Syntax

fqdn();

Aliases: fullHostName, FQDN.

Returned value

  • String with the fully qualified domain name. String.

Example

SELECT FQDN();

Result:

┌─FQDN()──────────────────────────┐
│ clickhouse.ru-central1.internal │
└─────────────────────────────────┘

basename

Extracts the tail of a string following its last slash or backslash. This function if often used to extract the filename from a path.

basename(expr)

Arguments

  • expr — A value of type String. Backslashes must be escaped.

Returned Value

A string that contains:

  • The tail of the input string after its last slash or backslash. If the input string ends with a slash or backslash (e.g. / or c:\), the function returns an empty string.
  • The original string if there are no slashes or backslashes.

Example

Query:

SELECT 'some/long/path/to/file' AS a, basename(a)

Result:

┌─a──────────────────────┬─basename('some\\long\\path\\to\\file')─┐
│ some\long\path\to\file │ file                                   │
└────────────────────────┴────────────────────────────────────────┘

Query:

SELECT 'some\\long\\path\\to\\file' AS a, basename(a)

Result:

┌─a──────────────────────┬─basename('some\\long\\path\\to\\file')─┐
│ some\long\path\to\file │ file                                   │
└────────────────────────┴────────────────────────────────────────┘

Query:

SELECT 'some-file-name' AS a, basename(a)

Result:

┌─a──────────────┬─basename('some-file-name')─┐
│ some-file-name │ some-file-name             │
└────────────────┴────────────────────────────┘

visibleWidth

Calculates the approximate width when outputting values to the console in text format (tab-separated). This function is used by the system to implement Pretty formats.

NULL is represented as a string corresponding to NULL in Pretty formats.

Syntax

visibleWidth(x)

Example

Query:

SELECT visibleWidth(NULL)

Result:

┌─visibleWidth(NULL)─┐
│                  4 │
└────────────────────┘

toTypeName

Returns the type name of the passed argument.

If NULL is passed, then the function returns type Nullable(Nothing), which corresponds to ClickHouse’s internal NULL representation.

Syntax

toTypeName(x)

blockSize

In ClickHouse, queries are processed in blocks (chunks). This function returns the size (row count) of the block the function is called on.

Syntax

blockSize()

Example

Query:

DROP TABLE IF EXISTS test;
CREATE TABLE test (n UInt8) ENGINE = Memory;

INSERT INTO test
SELECT * FROM system.numbers LIMIT 5;

SELECT blockSize()
FROM test;

Result:

   ┌─blockSize()─┐
1. │           5 │
2. │           5 │
3. │           5 │
4. │           5 │
5. │           5 │
   └─────────────┘ 

byteSize

Returns an estimation of uncompressed byte size of its arguments in memory.

Syntax

byteSize(argument [, ...])

Arguments

  • argument — Value.

Returned value

  • Estimation of byte size of the arguments in memory. UInt64.

Examples

For String arguments, the function returns the string length + 9 (terminating zero + length).

Query:

SELECT byteSize('string');

Result:

┌─byteSize('string')─┐
│                 15 │
└────────────────────┘

Query:

CREATE TABLE test
(
    `key` Int32,
    `u8` UInt8,
    `u16` UInt16,
    `u32` UInt32,
    `u64` UInt64,
    `i8` Int8,
    `i16` Int16,
    `i32` Int32,
    `i64` Int64,
    `f32` Float32,
    `f64` Float64
)
ENGINE = MergeTree
ORDER BY key;

INSERT INTO test VALUES(1, 8, 16, 32, 64,  -8, -16, -32, -64, 32.32, 64.64);

SELECT key, byteSize(u8) AS `byteSize(UInt8)`, byteSize(u16) AS `byteSize(UInt16)`, byteSize(u32) AS `byteSize(UInt32)`, byteSize(u64) AS `byteSize(UInt64)`, byteSize(i8) AS `byteSize(Int8)`, byteSize(i16) AS `byteSize(Int16)`, byteSize(i32) AS `byteSize(Int32)`, byteSize(i64) AS `byteSize(Int64)`, byteSize(f32) AS `byteSize(Float32)`, byteSize(f64) AS `byteSize(Float64)` FROM test ORDER BY key ASC FORMAT Vertical;

Result:

Row 1:
──────
key:               1
byteSize(UInt8):   1
byteSize(UInt16):  2
byteSize(UInt32):  4
byteSize(UInt64):  8
byteSize(Int8):    1
byteSize(Int16):   2
byteSize(Int32):   4
byteSize(Int64):   8
byteSize(Float32): 4
byteSize(Float64): 8

If the function has multiple arguments, the function accumulates their byte sizes.

Query:

SELECT byteSize(NULL, 1, 0.3, '');

Result:

┌─byteSize(NULL, 1, 0.3, '')─┐
│                         19 │
└────────────────────────────┘

materialize

Turns a constant into a full column containing a single value. Full columns and constants are represented differently in memory. Functions usually execute different code for normal and constant arguments, although the result should typically be the same. This function can be used to debug this behavior.

Syntax

materialize(x)

Parameters

Returned value

  • A column containing a single value x.

Example

In the example below the countMatches function expects a constant second argument. This behaviour can be debugged by using the materialize function to turn a constant into a full column, verifying that the function throws an error for a non-constant argument.

Query:

SELECT countMatches('foobarfoo', 'foo');
SELECT countMatches('foobarfoo', materialize('foo'));

Result:

2
Code: 44. DB::Exception: Received from localhost:9000. DB::Exception: Illegal type of argument #2 'pattern' of function countMatches, expected constant String, got String

ignore

Accepts any arguments, including NULL and does nothing. Always returns 0. The argument is internally still evaluated. Useful e.g. for benchmarks.

Syntax

ignore(x)

sleep

Used to introduce a delay or pause in the execution of a query. It is primarily used for testing and debugging purposes.

Syntax

sleep(seconds)

Arguments

  • seconds: UInt* or Float The number of seconds to pause the query execution to a maximum of 3 seconds. It can be a floating-point value to specify fractional seconds.

Returned value

This function does not return any value.

Example

SELECT sleep(2);

This function does not return any value. However, if you run the function with clickhouse client you will see something similar to:

SELECT sleep(2)

Query id: 8aa9943e-a686-45e1-8317-6e8e3a5596ac

┌─sleep(2)─┐
│        0 │
└──────────┘

1 row in set. Elapsed: 2.012 sec.

This query will pause for 2 seconds before completing. During this time, no results will be returned, and the query will appear to be hanging or unresponsive.

Implementation details

The sleep() function is generally not used in production environments, as it can negatively impact query performance and system responsiveness. However, it can be useful in the following scenarios:

  1. Testing: When testing or benchmarking ClickHouse, you may want to simulate delays or introduce pauses to observe how the system behaves under certain conditions.
  2. Debugging: If you need to examine the state of the system or the execution of a query at a specific point in time, you can use sleep() to introduce a pause, allowing you to inspect or collect relevant information.
  3. Simulation: In some cases, you may want to simulate real-world scenarios where delays or pauses occur, such as network latency or external system dependencies.

It’s important to use the sleep() function judiciously and only when necessary, as it can potentially impact the overall performance and responsiveness of your ClickHouse system.

sleepEachRow

Pauses the execution of a query for a specified number of seconds for each row in the result set.

Syntax

sleepEachRow(seconds)

Arguments

  • seconds: UInt* or Float* The number of seconds to pause the query execution for each row in the result set to a maximum of 3 seconds. It can be a floating-point value to specify fractional seconds.

Returned value

This function returns the same input values as it receives, without modifying them.

Example

SELECT number, sleepEachRow(0.5) FROM system.numbers LIMIT 5;
┌─number─┬─sleepEachRow(0.5)─┐
│      0 │                 0 │
│      1 │                 0 │
│      2 │                 0 │
│      3 │                 0 │
│      4 │                 0 │
└────────┴───────────────────┘

But the output will be delayed, with a 0.5-second pause between each row.

The sleepEachRow() function is primarily used for testing and debugging purposes, similar to the sleep() function. It allows you to simulate delays or introduce pauses in the processing of each row, which can be useful in scenarios such as:

  1. Testing: When testing or benchmarking ClickHouse’s performance under specific conditions, you can use sleepEachRow() to simulate delays or introduce pauses for each row processed.
  2. Debugging: If you need to examine the state of the system or the execution of a query for each row processed, you can use sleepEachRow() to introduce pauses, allowing you to inspect or collect relevant information.
  3. Simulation: In some cases, you may want to simulate real-world scenarios where delays or pauses occur for each row processed, such as when dealing with external systems or network latencies.

Like the sleep() function, it’s important to use sleepEachRow() judiciously and only when necessary, as it can significantly impact the overall performance and responsiveness of your ClickHouse system, especially when dealing with large result sets.

currentDatabase

Returns the name of the current database. Useful in table engine parameters of CREATE TABLE queries where you need to specify the database.

Syntax

currentDatabase()

currentUser

Returns the name of the current user. In case of a distributed query, the name of the user who initiated the query is returned.

Syntax

currentUser()

Aliases: user(), USER(), current_user(). Aliases are case insensitive.

Returned values

  • The name of the current user. String.
  • In distributed queries, the login of the user who initiated the query. String.

Example

SELECT currentUser();

Result:

┌─currentUser()─┐
│ default       │
└───────────────┘

isConstant

Returns whether the argument is a constant expression.

A constant expression is an expression whose result is known during query analysis, i.e. before execution. For example, expressions over literals are constant expressions.

This function is mostly intended for development, debugging and demonstration.

Syntax

isConstant(x)

Arguments

  • x — Expression to check.

Returned values

  • 1 if x is constant. UInt8.
  • 0 if x is non-constant. UInt8.

Examples

Query:

SELECT isConstant(x + 1) FROM (SELECT 43 AS x)

Result:

┌─isConstant(plus(x, 1))─┐
│                      1 │
└────────────────────────┘

Query:

WITH 3.14 AS pi SELECT isConstant(cos(pi))

Result:

┌─isConstant(cos(pi))─┐
│                   1 │
└─────────────────────┘

Query:

SELECT isConstant(number) FROM numbers(1)

Result:

┌─isConstant(number)─┐
│                  0 │
└────────────────────┘

hasColumnInTable

Given the database name, the table name, and the column name as constant strings, returns 1 if the given column exists, otherwise 0.

Syntax

hasColumnInTable(\[‘hostname’\[, ‘username’\[, ‘password’\]\],\] ‘database’, ‘table’, ‘column’)

Parameters

Returned value

  • 1 if the given column exists.
  • 0, otherwise.

Implementation details

For elements in a nested data structure, the function checks for the existence of a column. For the nested data structure itself, the function returns 0.

Example

Query:

SELECT hasColumnInTable('system','metrics','metric')
1
SELECT hasColumnInTable('system','metrics','non-existing_column')
0

hasThreadFuzzer

Returns whether Thread Fuzzer is effective. It can be used in tests to prevent runs from being too long.

Syntax

hasThreadFuzzer();

bar

Builds a bar chart.

bar(x, min, max, width) draws a band with width proportional to (x - min) and equal to width characters when x = max.

Arguments

  • x — Size to display.
  • min, max — Integer constants. The value must fit in Int64.
  • width — Constant, positive integer, can be fractional.

The band is drawn with accuracy to one eighth of a symbol.

Example:

SELECT
    toHour(EventTime) AS h,
    count() AS c,
    bar(c, 0, 600000, 20) AS bar
FROM test.hits
GROUP BY h
ORDER BY h ASC
┌──h─┬──────c─┬─bar────────────────┐
│  0 │ 292907 │ █████████▋         │
│  1 │ 180563 │ ██████             │
│  2 │ 114861 │ ███▋               │
│  3 │  85069 │ ██▋                │
│  4 │  68543 │ ██▎                │
│  5 │  78116 │ ██▌                │
│  6 │ 113474 │ ███▋               │
│  7 │ 170678 │ █████▋             │
│  8 │ 278380 │ █████████▎         │
│  9 │ 391053 │ █████████████      │
│ 10 │ 457681 │ ███████████████▎   │
│ 11 │ 493667 │ ████████████████▍  │
│ 12 │ 509641 │ ████████████████▊  │
│ 13 │ 522947 │ █████████████████▍ │
│ 14 │ 539954 │ █████████████████▊ │
│ 15 │ 528460 │ █████████████████▌ │
│ 16 │ 539201 │ █████████████████▊ │
│ 17 │ 523539 │ █████████████████▍ │
│ 18 │ 506467 │ ████████████████▊  │
│ 19 │ 520915 │ █████████████████▎ │
│ 20 │ 521665 │ █████████████████▍ │
│ 21 │ 542078 │ ██████████████████ │
│ 22 │ 493642 │ ████████████████▍  │
│ 23 │ 400397 │ █████████████▎     │
└────┴────────┴────────────────────┘

transform

Transforms a value according to the explicitly defined mapping of some elements to other ones. There are two variations of this function:

transform(x, array_from, array_to, default)

x – What to transform.

array_from – Constant array of values to convert.

array_to – Constant array of values to convert the values in ‘from’ to.

default – Which value to use if ‘x’ is not equal to any of the values in ‘from’.

array_from and array_to must have equally many elements.

Signature:

For x equal to one of the elements in array_from, the function returns the corresponding element in array_to, i.e. the one at the same array index. Otherwise, it returns default. If multiple matching elements exist array_from, it returns the element corresponding to the first of them.

transform(T, Array(T), Array(U), U) -> U

T and U can be numeric, string, or Date or DateTime types. The same letter (T or U) means that types must be mutually compatible and not necessarily equal. For example, the first argument could have type Int64, while the second argument could have type Array(UInt16).

Example:

SELECT
    transform(SearchEngineID, [2, 3], ['Yandex', 'Google'], 'Other') AS title,
    count() AS c
FROM test.hits
WHERE SearchEngineID != 0
GROUP BY title
ORDER BY c DESC
┌─title─────┬──────c─┐
│ Yandex    │ 498635 │
│ Google    │ 229872 │
│ Other     │ 104472 │
└───────────┴────────┘

transform(x, array_from, array_to)

Similar to the other variation but has no ‘default’ argument. In case no match can be found, x is returned.

Example:

SELECT
    transform(domain(Referer), ['yandex.ru', 'google.ru', 'vkontakte.ru'], ['www.yandex', 'example.com', 'vk.com']) AS s,
    count() AS c
FROM test.hits
GROUP BY domain(Referer)
ORDER BY count() DESC
LIMIT 10
┌─s──────────────┬───────c─┐
│                │ 2906259 │
│ www.yandex     │  867767 │
│ ███████.ru     │  313599 │
│ mail.yandex.ru │  107147 │
│ ██████.ru      │  100355 │
│ █████████.ru   │   65040 │
│ news.yandex.ru │   64515 │
│ ██████.net     │   59141 │
│ example.com    │   57316 │
└────────────────┴─────────┘

formatReadableDecimalSize

Given a size (number of bytes), this function returns a readable, rounded size with suffix (KB, MB, etc.) as string.

The opposite operations of this function are parseReadableSize, parseReadableSizeOrZero, and parseReadableSizeOrNull.

Syntax

formatReadableDecimalSize(x)

Example

Query:

SELECT
    arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
    formatReadableDecimalSize(filesize_bytes) AS filesize

Result:

┌─filesize_bytes─┬─filesize───┐
│              1 │ 1.00 B     │
│           1024 │ 1.02 KB   │
│        1048576 │ 1.05 MB   │
│      192851925 │ 192.85 MB │
└────────────────┴────────────┘

formatReadableSize

Given a size (number of bytes), this function returns a readable, rounded size with suffix (KiB, MiB, etc.) as string.

The opposite operations of this function are parseReadableSize, parseReadableSizeOrZero, and parseReadableSizeOrNull.

Syntax

formatReadableSize(x)

Alias: FORMAT_BYTES.

Example

Query:

SELECT
    arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
    formatReadableSize(filesize_bytes) AS filesize

Result:

┌─filesize_bytes─┬─filesize───┐
│              1 │ 1.00 B     │
│           1024 │ 1.00 KiB   │
│        1048576 │ 1.00 MiB   │
│      192851925 │ 183.92 MiB │
└────────────────┴────────────┘

formatReadableQuantity

Given a number, this function returns a rounded number with suffix (thousand, million, billion, etc.) as string.

Syntax

formatReadableQuantity(x)

Example

Query:

SELECT
    arrayJoin([1024, 1234 * 1000, (4567 * 1000) * 1000, 98765432101234]) AS number,
    formatReadableQuantity(number) AS number_for_humans

Result:

┌─────────number─┬─number_for_humans─┐
│           1024 │ 1.02 thousand     │
│        1234000 │ 1.23 million      │
│     4567000000 │ 4.57 billion      │
│ 98765432101234 │ 98.77 trillion    │
└────────────────┴───────────────────┘

formatReadableTimeDelta

Given a time interval (delta) in seconds, this function returns a time delta with year/month/day/hour/minute/second/millisecond/microsecond/nanosecond as string.

Syntax

formatReadableTimeDelta(column[, maximum_unit, minimum_unit])

Arguments

  • column — A column with a numeric time delta.
  • maximum_unit — Optional. Maximum unit to show.
    • Acceptable values: nanoseconds, microseconds, milliseconds, seconds, minutes, hours, days, months, years.
    • Default value: years.
  • minimum_unit — Optional. Minimum unit to show. All smaller units are truncated.
    • Acceptable values: nanoseconds, microseconds, milliseconds, seconds, minutes, hours, days, months, years.
    • If explicitly specified value is bigger than maximum_unit, an exception will be thrown.
    • Default value: seconds if maximum_unit is seconds or bigger, nanoseconds otherwise.

Example

SELECT
    arrayJoin([100, 12345, 432546534]) AS elapsed,
    formatReadableTimeDelta(elapsed) AS time_delta
┌────elapsed─┬─time_delta ─────────────────────────────────────────────────────┐
│        100 │ 1 minute and 40 seconds                                         │
│      12345 │ 3 hours, 25 minutes and 45 seconds                              │
│  432546534 │ 13 years, 8 months, 17 days, 7 hours, 48 minutes and 54 seconds │
└────────────┴─────────────────────────────────────────────────────────────────┘
SELECT
    arrayJoin([100, 12345, 432546534]) AS elapsed,
    formatReadableTimeDelta(elapsed, 'minutes') AS time_delta
┌────elapsed─┬─time_delta ─────────────────────────────────────────────────────┐
│        100 │ 1 minute and 40 seconds                                         │
│      12345 │ 205 minutes and 45 seconds                                      │
│  432546534 │ 7209108 minutes and 54 seconds                                  │
└────────────┴─────────────────────────────────────────────────────────────────┘
SELECT
    arrayJoin([100, 12345, 432546534.00000006]) AS elapsed,
    formatReadableTimeDelta(elapsed, 'minutes', 'nanoseconds') AS time_delta
┌────────────elapsed─┬─time_delta─────────────────────────────────────┐
│                100 │ 1 minute and 40 seconds                        │
│              12345 │ 205 minutes and 45 seconds                     │
│ 432546534.00000006 │ 7209108 minutes, 54 seconds and 60 nanoseconds │
└────────────────────┴────────────────────────────────────────────────┘

parseReadableSize

Given a string containing a byte size and B, KiB, KB, MiB, MB, etc. as a unit (i.e. ISO/IEC 80000-13 or decimal byte unit), this function returns the corresponding number of bytes.
If the function is unable to parse the input value, it throws an exception.

The inverse operations of this function are formatReadableSize and formatReadableDecimalSize.

Syntax

formatReadableSize(x)

Arguments

  • x : Readable size with ISO/IEC 80000-13 or decimal byte unit (String).

Returned value

  • Number of bytes, rounded up to the nearest integer (UInt64).

Example

SELECT
    arrayJoin(['1 B', '1 KiB', '3 MB', '5.314 KiB']) AS readable_sizes,  
    parseReadableSize(readable_sizes) AS sizes;
┌─readable_sizes─┬───sizes─┐
│ 1 B            │       1 │
│ 1 KiB          │    1024 │
│ 3 MB           │ 3000000 │
│ 5.314 KiB      │    5442 │
└────────────────┴─────────┘

parseReadableSizeOrNull

Given a string containing a byte size and B, KiB, KB, MiB, MB, etc. as a unit (i.e. ISO/IEC 80000-13 or decimal byte unit), this function returns the corresponding number of bytes.
If the function is unable to parse the input value, it returns NULL.

The inverse operations of this function are formatReadableSize and formatReadableDecimalSize.

Syntax

parseReadableSizeOrNull(x)

Arguments

  • x : Readable size with ISO/IEC 80000-13 or decimal byte unit (String).

Returned value

  • Number of bytes, rounded up to the nearest integer, or NULL if unable to parse the input (Nullable(UInt64)).

Example

SELECT
    arrayJoin(['1 B', '1 KiB', '3 MB', '5.314 KiB', 'invalid']) AS readable_sizes,  
    parseReadableSizeOrNull(readable_sizes) AS sizes;
┌─readable_sizes─┬───sizes─┐
│ 1 B            │       1 │
│ 1 KiB          │    1024 │
│ 3 MB           │ 3000000 │
│ 5.314 KiB      │    5442 │
│ invalid        │    ᴺᵁᴸᴸ │
└────────────────┴─────────┘

parseReadableSizeOrZero

Given a string containing a byte size and B, KiB, KB, MiB, MB, etc. as a unit (i.e. ISO/IEC 80000-13 or decimal byte unit), this function returns the corresponding number of bytes. If the function is unable to parse the input value, it returns 0.

The inverse operations of this function are formatReadableSize and formatReadableDecimalSize.

Syntax

parseReadableSizeOrZero(x)

Arguments

  • x : Readable size with ISO/IEC 80000-13 or decimal byte unit (String).

Returned value

  • Number of bytes, rounded up to the nearest integer, or 0 if unable to parse the input (UInt64).

Example

SELECT
    arrayJoin(['1 B', '1 KiB', '3 MB', '5.314 KiB', 'invalid']) AS readable_sizes,  
    parseReadableSizeOrZero(readable_sizes) AS sizes;
┌─readable_sizes─┬───sizes─┐
│ 1 B            │       1 │
│ 1 KiB          │    1024 │
│ 3 MB           │ 3000000 │
│ 5.314 KiB      │    5442 │
│ invalid        │       0 │
└────────────────┴─────────┘

parseTimeDelta

Parse a sequence of numbers followed by something resembling a time unit.

Syntax

parseTimeDelta(timestr)

Arguments

  • timestr — A sequence of numbers followed by something resembling a time unit.

Returned value

  • A floating-point number with the number of seconds.

Example

SELECT parseTimeDelta('11s+22min')
┌─parseTimeDelta('11s+22min')─┐
│                        1331 │
└─────────────────────────────┘
SELECT parseTimeDelta('1yr2mo')
┌─parseTimeDelta('1yr2mo')─┐
│                 36806400 │
└──────────────────────────┘

least

Returns the smaller value of a and b.

Syntax

least(a, b)

greatest

Returns the larger value of a and b.

Syntax

greatest(a, b)

uptime

Returns the server’s uptime in seconds. If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Syntax

uptime()

Returned value

  • Time value of seconds. UInt32.

Example

Query:

SELECT uptime() as Uptime;

Result:

┌─Uptime─┐
│  55867 │
└────────┘

version

Returns the current version of ClickHouse as a string in the form of:

  • Major version
  • Minor version
  • Patch version
  • Number of commits since the previous stable release.
major_version.minor_version.patch_version.number_of_commits_since_the_previous_stable_release

If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise, it produces a constant value.

Syntax

version()

Arguments

None.

Returned value

  • Current version of ClickHouse. String.

Implementation details

None.

Example

Query:

SELECT version()

Result:

┌─version()─┐
│ 24.2.1.1  │
└───────────┘

buildId

Returns the build ID generated by a compiler for the running ClickHouse server binary. If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Syntax

buildId()

blockNumber

Returns a monotonically increasing sequence number of the block containing the row. The returned block number is updated on a best-effort basis, i.e. it may not be fully accurate.

Syntax

blockNumber()

Returned value

  • Sequence number of the data block where the row is located. UInt64.

Example

Query:

SELECT blockNumber()
FROM
(
    SELECT *
    FROM system.numbers
    LIMIT 10
) SETTINGS max_block_size = 2

Result:

┌─blockNumber()─┐
│             7 │
│             7 │
└───────────────┘
┌─blockNumber()─┐
│             8 │
│             8 │
└───────────────┘
┌─blockNumber()─┐
│             9 │
│             9 │
└───────────────┘
┌─blockNumber()─┐
│            10 │
│            10 │
└───────────────┘
┌─blockNumber()─┐
│            11 │
│            11 │
└───────────────┘

rowNumberInBlock

Returns for each block processed by rowNumberInBlock the number of the current row. The returned number starts for each block at 0.

Syntax

rowNumberInBlock()

Returned value

  • Ordinal number of the row in the data block starting from 0. UInt64.

Example

Query:

SELECT rowNumberInBlock()
FROM
(
    SELECT *
    FROM system.numbers_mt
    LIMIT 10
) SETTINGS max_block_size = 2

Result:

┌─rowNumberInBlock()─┐
│                  0 │
│                  1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│                  0 │
│                  1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│                  0 │
│                  1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│                  0 │
│                  1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│                  0 │
│                  1 │
└────────────────────┘

rowNumberInAllBlocks

Returns a unique row number for each row processed by rowNumberInAllBlocks. The returned numbers start at 0.

Syntax

rowNumberInAllBlocks()

Returned value

  • Ordinal number of the row in the data block starting from 0. UInt64.

Example

Query:

SELECT rowNumberInAllBlocks()
FROM
(
    SELECT *
    FROM system.numbers_mt
    LIMIT 10
)
SETTINGS max_block_size = 2

Result:

┌─rowNumberInAllBlocks()─┐
│                      0 │
│                      1 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│                      4 │
│                      5 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│                      2 │
│                      3 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│                      6 │
│                      7 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│                      8 │
│                      9 │
└────────────────────────┘

neighbor

The window function that provides access to a row at a specified offset before or after the current row of a given column.

Syntax

neighbor(column, offset[, default_value])

The result of the function depends on the affected data blocks and the order of data in the block.

Only returns neighbor inside the currently processed data block. Because of this error-prone behavior the function is DEPRECATED, please use proper window functions instead.


The order of rows during calculation of neighbor() can differ from the order of rows returned to the user. To prevent that you can create a subquery with ORDER BY and call the function from outside the subquery.

Arguments

  • column — A column name or scalar expression.
  • offset — The number of rows to look before or ahead of the current row in column. Int64.
  • default_value — Optional. The returned value if offset is beyond the block boundaries. Type of data blocks affected.

Returned values

  • Value of column with offset distance from current row, if offset is not outside the block boundaries.
  • The default value of column or default_value (if given), if offset is outside the block boundaries.

The return type will be that of the data blocks affected or the default value type.


Example

Query:

SELECT number, neighbor(number, 2) FROM system.numbers LIMIT 10;

Result:

┌─number─┬─neighbor(number, 2)─┐
│      0 │                   2 │
│      1 │                   3 │
│      2 │                   4 │
│      3 │                   5 │
│      4 │                   6 │
│      5 │                   7 │
│      6 │                   8 │
│      7 │                   9 │
│      8 │                   0 │
│      9 │                   0 │
└────────┴─────────────────────┘

Query:

SELECT number, neighbor(number, 2, 999) FROM system.numbers LIMIT 10;

Result:

┌─number─┬─neighbor(number, 2, 999)─┐
│      0 │                        2 │
│      1 │                        3 │
│      2 │                        4 │
│      3 │                        5 │
│      4 │                        6 │
│      5 │                        7 │
│      6 │                        8 │
│      7 │                        9 │
│      8 │                      999 │
│      9 │                      999 │
└────────┴──────────────────────────┘

This function can be used to compute year-over-year metric value:

Query:

WITH toDate('2018-01-01') AS start_date
SELECT
    toStartOfMonth(start_date + (number * 32)) AS month,
    toInt32(month) % 100 AS money,
    neighbor(money, -12) AS prev_year,
    round(prev_year / money, 2) AS year_over_year
FROM numbers(16)

Result:

┌──────month─┬─money─┬─prev_year─┬─year_over_year─┐
│ 2018-01-01 │    32 │         0 │              0 │
│ 2018-02-01 │    63 │         0 │              0 │
│ 2018-03-01 │    91 │         0 │              0 │
│ 2018-04-01 │    22 │         0 │              0 │
│ 2018-05-01 │    52 │         0 │              0 │
│ 2018-06-01 │    83 │         0 │              0 │
│ 2018-07-01 │    13 │         0 │              0 │
│ 2018-08-01 │    44 │         0 │              0 │
│ 2018-09-01 │    75 │         0 │              0 │
│ 2018-10-01 │     5 │         0 │              0 │
│ 2018-11-01 │    36 │         0 │              0 │
│ 2018-12-01 │    66 │         0 │              0 │
│ 2019-01-01 │    97 │        32 │           0.33 │
│ 2019-02-01 │    28 │        63 │           2.25 │
│ 2019-03-01 │    56 │        91 │           1.62 │
│ 2019-04-01 │    87 │        22 │           0.25 │
└────────────┴───────┴───────────┴────────────────┘

runningDifference

Calculates the difference between two consecutive row values in the data block. Returns 0 for the first row, and for subsequent rows the difference to the previous row.

Only returns differences inside the currently processed data block. Because of this error-prone behavior the function is DEPRECATED, please use proper window functions instead.


The result of the function depends on the affected data blocks and the order of data in the block.

The order of rows during calculation of runningDifference() can differ from the order of rows returned to the user. To prevent that you can create a subquery with ORDER BY and call the function from outside the subquery.

Syntax

runningDifference(x)

Example

Query:

SELECT
    EventID,
    EventTime,
    runningDifference(EventTime) AS delta
FROM
(
    SELECT
        EventID,
        EventTime
    FROM events
    WHERE EventDate = '2016-11-24'
    ORDER BY EventTime ASC
    LIMIT 5
)

Result:

┌─EventID─┬───────────EventTime─┬─delta─┐
│    1106 │ 2016-11-24 00:00:04 │     0 │
│    1107 │ 2016-11-24 00:00:05 │     1 │
│    1108 │ 2016-11-24 00:00:05 │     0 │
│    1109 │ 2016-11-24 00:00:09 │     4 │
│    1110 │ 2016-11-24 00:00:10 │     1 │
└─────────┴─────────────────────┴───────┘

Please note that the block size affects the result. The internal state of runningDifference state is reset for each new block.

Query:

SELECT
    number,
    runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1

Result:

┌─number─┬─diff─┐
│      0 │    0 │
└────────┴──────┘
┌─number─┬─diff─┐
│  65536 │    0 │
└────────┴──────┘

Query:

set max_block_size=100000 -- default value is 65536!

SELECT
    number,
    runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1

Result:

┌─number─┬─diff─┐
│      0 │    0 │
└────────┴──────┘

runningDifferenceStartingWithFirstValue

This function is DEPRECATED (see the note for runningDifference).


Same as runningDifference, but returns the value of the first row as the value on the first row.

runningConcurrency

Calculates the number of concurrent events. Each event has a start time and an end time. The start time is included in the event, while the end time is excluded. Columns with a start time and an end time must be of the same data type. The function calculates the total number of active (concurrent) events for each event start time.

Events must be ordered by the start time in ascending order. If this requirement is violated the function raises an exception. Every data block is processed separately. If events from different data blocks overlap then they can not be processed correctly.


Syntax

runningConcurrency(start, end)

Arguments

Returned values

  • The number of concurrent events at each event start time. UInt32

Example

Consider the table:

┌──────start─┬────────end─┐
│ 2021-03-03 │ 2021-03-11 │
│ 2021-03-06 │ 2021-03-12 │
│ 2021-03-07 │ 2021-03-08 │
│ 2021-03-11 │ 2021-03-12 │
└────────────┴────────────┘

Query:

SELECT start, runningConcurrency(start, end) FROM example_table;

Result:

┌──────start─┬─runningConcurrency(start, end)─┐
│ 2021-03-03 │                              1 │
│ 2021-03-06 │                              2 │
│ 2021-03-07 │                              3 │
│ 2021-03-11 │                              2 │
└────────────┴────────────────────────────────┘

MACNumToString

Interprets a UInt64 number as a MAC address in big endian format. Returns the corresponding MAC address in format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form) as string.

Syntax

MACNumToString(num)

MACStringToNum

The inverse function of MACNumToString. If the MAC address has an invalid format, it returns 0.

Syntax

MACStringToNum(s)

MACStringToOUI

Given a MAC address in format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form), returns the first three octets as a UInt64 number. If the MAC address has an invalid format, it returns 0.

Syntax

MACStringToOUI(s)

getSizeOfEnumType

Returns the number of fields in Enum. An exception is thrown if the type is not Enum.

Syntax

getSizeOfEnumType(value)

Arguments:

  • value — Value of type Enum.

Returned values

  • The number of fields with Enum input values.

Example

SELECT getSizeOfEnumType( CAST('a' AS Enum8('a' = 1, 'b' = 2) ) ) AS x
┌─x─┐
│ 2 │
└───┘

blockSerializedSize

Returns the size on disk without considering compression.

blockSerializedSize(value[, value[, ...]])

Arguments

  • value — Any value.

Returned values

  • The number of bytes that will be written to disk for block of values without compression.

Example

Query:

SELECT blockSerializedSize(maxState(1)) as x

Result:

┌─x─┐
│ 2 │
└───┘

toColumnTypeName

Returns the internal name of the data type that represents the value.

Syntax

toColumnTypeName(value)

Arguments:

  • value — Any type of value.

Returned values

  • The internal data type name used to represent value.

Example

Difference between toTypeName ' and ' toColumnTypeName:

SELECT toTypeName(CAST('2018-01-01 01:02:03' AS DateTime))

Result:

┌─toTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime                                            │
└─────────────────────────────────────────────────────┘

Query:

SELECT toColumnTypeName(CAST('2018-01-01 01:02:03' AS DateTime))

Result:

┌─toColumnTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ Const(UInt32)                                             │
└───────────────────────────────────────────────────────────┘

The example shows that the DateTime data type is internally stored as Const(UInt32).

dumpColumnStructure

Outputs a detailed description of data structures in RAM

dumpColumnStructure(value)

Arguments:

  • value — Any type of value.

Returned values

  • A description of the column structure used for representing value.

Example

SELECT dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))
┌─dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime, Const(size = 1, UInt32(size = 1))                  │
└──────────────────────────────────────────────────────────────┘

defaultValueOfArgumentType

Returns the default value for the given data type.

Does not include default values for custom columns set by the user.

Syntax

defaultValueOfArgumentType(expression)

Arguments:

  • expression — Arbitrary type of value or an expression that results in a value of an arbitrary type.

Returned values

  • 0 for numbers.
  • Empty string for strings.
  • ᴺᵁᴸᴸ for Nullable.

Example

Query:

SELECT defaultValueOfArgumentType( CAST(1 AS Int8) )

Result:

┌─defaultValueOfArgumentType(CAST(1, 'Int8'))─┐
│                                           0 │
└─────────────────────────────────────────────┘

Query:

SELECT defaultValueOfArgumentType( CAST(1 AS Nullable(Int8) ) )

Result:

┌─defaultValueOfArgumentType(CAST(1, 'Nullable(Int8)'))─┐
│                                                  ᴺᵁᴸᴸ │
└───────────────────────────────────────────────────────┘

defaultValueOfTypeName

Returns the default value for the given type name.

Does not include default values for custom columns set by the user.

defaultValueOfTypeName(type)

Arguments:

  • type — A string representing a type name.

Returned values

  • 0 for numbers.
  • Empty string for strings.
  • ᴺᵁᴸᴸ for Nullable.

Example

Query:

SELECT defaultValueOfTypeName('Int8')

Result:

┌─defaultValueOfTypeName('Int8')─┐
│                              0 │
└────────────────────────────────┘

Query:

SELECT defaultValueOfTypeName('Nullable(Int8)')

Result:

┌─defaultValueOfTypeName('Nullable(Int8)')─┐
│                                     ᴺᵁᴸᴸ │
└──────────────────────────────────────────┘

indexHint

This function is intended for debugging and introspection. It ignores its argument and always returns 1. The arguments are not evaluated.

But during index analysis, the argument of this function is assumed to be not wrapped in indexHint. This allows to select data in index ranges by the corresponding condition but without further filtering by this condition. The index in ClickHouse is sparse and using indexHint will yield more data than specifying the same condition directly.

Syntax

SELECT * FROM table WHERE indexHint(<expression>)

Returned value

Example

Here is the example of test data from the table ontime.

Table:

SELECT count() FROM ontime
┌─count()─┐
│ 4276457 │
└─────────┘

The table has indexes on the fields (FlightDate, (Year, FlightDate)).

Create a query which does not use the index:

SELECT FlightDate AS k, count() FROM ontime GROUP BY k ORDER BY k

ClickHouse processed the entire table (Processed 4.28 million rows).

Result:

┌──────────k─┬─count()─┐
│ 2017-01-01 │   13970 │
│ 2017-01-02 │   15882 │
........................
│ 2017-09-28 │   16411 │
│ 2017-09-29 │   16384 │
│ 2017-09-30 │   12520 │
└────────────┴─────────┘

To apply the index, select a specific date:

SELECT FlightDate AS k, count() FROM ontime WHERE k = '2017-09-15' GROUP BY k ORDER BY k

ClickHouse now uses the index to process a significantly smaller number of rows (Processed 32.74 thousand rows).

Result:

┌──────────k─┬─count()─┐
│ 2017-09-15 │   16428 │
└────────────┴─────────┘

Now wrap the expression k = '2017-09-15' in function indexHint:

Query:

SELECT
    FlightDate AS k,
    count()
FROM ontime
WHERE indexHint(k = '2017-09-15')
GROUP BY k
ORDER BY k ASC

ClickHouse used the index the same way as previously (Processed 32.74 thousand rows). The expression k = '2017-09-15' was not used when generating the result. In example, the indexHint function allows to see adjacent dates.

Result:

┌──────────k─┬─count()─┐
│ 2017-09-14 │    7071 │
│ 2017-09-15 │   16428 │
│ 2017-09-16 │    1077 │
│ 2017-09-30 │    8167 │
└────────────┴─────────┘

replicate

Creates an array with a single value.

This function is used for the internal implementation of arrayJoin.


Syntax

replicate(x, arr)

Arguments

  • x — The value to fill the result array with.
  • arr — An array. Array.

Returned value

An array of the lame length as arr filled with value x. Array.

Example

Query:

SELECT replicate(1, ['a', 'b', 'c']);

Result:

┌─replicate(1, ['a', 'b', 'c'])─┐
│ [1,1,1]                       │
└───────────────────────────────┘

revision

Returns the current ClickHouse server revision.

Syntax

revision()

Returned value

  • The current ClickHouse server revision. UInt32.

Example

Query:

SELECT revision();

Result:

┌─revision()─┐
│      54485 │
└────────────┘

filesystemAvailable

Returns the amount of free space in the filesystem hosting the database persistence. The returned value is always smaller than total free space (filesystemFree) because some space is reserved for the operating system.

Syntax

filesystemAvailable()

Returned value

  • The amount of remaining space available in bytes. UInt64.

Example

Query:

SELECT formatReadableSize(filesystemAvailable()) AS "Available space";

Result:

┌─Available space─┐
│ 30.75 GiB       │
└─────────────────┘

filesystemUnreserved

Returns the total amount of the free space on the filesystem hosting the database persistence. (previously filesystemFree). See also filesystemAvailable.

Syntax

filesystemUnreserved()

Returned value

  • The amount of free space in bytes. UInt64.

Example

Query:

SELECT formatReadableSize(filesystemUnreserved()) AS "Free space";

Result:

┌─Free space─┐
│ 32.39 GiB  │
└────────────┘

filesystemCapacity

Returns the capacity of the filesystem in bytes. Needs the path to the data directory to be configured.

Syntax

filesystemCapacity()

Returned value

  • Capacity of the filesystem in bytes. UInt64.

Example

Query:

SELECT formatReadableSize(filesystemCapacity()) AS "Capacity";

Result:

┌─Capacity──┐
│ 39.32 GiB │
└───────────┘

initializeAggregation

Calculates the result of an aggregate function based on a single value. This function can be used to initialize aggregate functions with combinator -State. You can create states of aggregate functions and insert them to columns of type AggregateFunction or use initialized aggregates as default values.

Syntax

initializeAggregation (aggregate_function, arg1, arg2, ..., argN)

Arguments

  • aggregate_function — Name of the aggregation function to initialize. String.
  • arg — Arguments of aggregate function.

Returned value(s)

  • Result of aggregation for every row passed to the function.

The return type is the same as the return type of function, that initializeAggregation takes as first argument.

Example

Query:

SELECT uniqMerge(state) FROM (SELECT initializeAggregation('uniqState', number % 3) AS state FROM numbers(10000));

Result:

┌─uniqMerge(state)─┐
│                3 │
└──────────────────┘

Query:

SELECT finalizeAggregation(state), toTypeName(state) FROM (SELECT initializeAggregation('sumState', number % 3) AS state FROM numbers(5));

Result:

┌─finalizeAggregation(state)─┬─toTypeName(state)─────────────┐
│                          0 │ AggregateFunction(sum, UInt8) │
│                          1 │ AggregateFunction(sum, UInt8) │
│                          2 │ AggregateFunction(sum, UInt8) │
│                          0 │ AggregateFunction(sum, UInt8) │
│                          1 │ AggregateFunction(sum, UInt8) │
└────────────────────────────┴───────────────────────────────┘

Example with AggregatingMergeTree table engine and AggregateFunction column:

CREATE TABLE metrics
(
    key UInt64,
    value AggregateFunction(sum, UInt64) DEFAULT initializeAggregation('sumState', toUInt64(0))
)
ENGINE = AggregatingMergeTree
ORDER BY key
INSERT INTO metrics VALUES (0, initializeAggregation('sumState', toUInt64(42)))

See Also

finalizeAggregation

Given a state of aggregate function, this function returns the result of aggregation (or finalized state when using a -State combinator).

Syntax

finalizeAggregation(state)

Arguments

Returned value(s)

  • Value/values that was aggregated.

The return type is equal to that of any types which were aggregated.


Examples

Query:

SELECT finalizeAggregation(( SELECT countState(number) FROM numbers(10)));

Result:

┌─finalizeAggregation(_subquery16)─┐
│                               10 │
└──────────────────────────────────┘

Query:

SELECT finalizeAggregation(( SELECT sumState(number) FROM numbers(10)));

Result:

┌─finalizeAggregation(_subquery20)─┐
│                               45 │
└──────────────────────────────────┘

Note that NULL values are ignored.

Query:

SELECT finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]));

Result:

┌─finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]))─┐
│                                                          2 │
└────────────────────────────────────────────────────────────┘

Combined example:

Query:

WITH initializeAggregation('sumState', number) AS one_row_sum_state
SELECT
    number,
    finalizeAggregation(one_row_sum_state) AS one_row_sum,
    runningAccumulate(one_row_sum_state) AS cumulative_sum
FROM numbers(10);

Result:

┌─number─┬─one_row_sum─┬─cumulative_sum─┐
│      0 │           0 │              0 │
│      1 │           1 │              1 │
│      2 │           2 │              3 │
│      3 │           3 │              6 │
│      4 │           4 │             10 │
│      5 │           5 │             15 │
│      6 │           6 │             21 │
│      7 │           7 │             28 │
│      8 │           8 │             36 │
│      9 │           9 │             45 │
└────────┴─────────────┴────────────────┘

See Also

runningAccumulate

Accumulates the states of an aggregate function for each row of a data block.

The state is reset for each new block of data. Because of this error-prone behavior the function is DEPRECATED, please use proper window functions instead.


Syntax

runningAccumulate(agg_state[, grouping]);

Arguments

  • agg_state — State of the aggregate function. AggregateFunction.
  • grouping — Grouping key. Optional. The state of the function is reset if the grouping value is changed. It can be any of the supported data types for which the equality operator is defined.

Returned value

  • Each resulting row contains a result of the aggregate function, accumulated for all the input rows from 0 to the current position. runningAccumulate resets states for each new data block or when the grouping value changes.

Type depends on the aggregate function used.

Examples

Consider how you can use runningAccumulate to find the cumulative sum of numbers without and with grouping.

Query:

SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);

Result:

┌─k─┬─res─┐
│ 0 │   0 │
│ 1 │   1 │
│ 2 │   3 │
│ 3 │   6 │
│ 4 │  10 │
│ 5 │  15 │
│ 6 │  21 │
│ 7 │  28 │
│ 8 │  36 │
│ 9 │  45 │
└───┴─────┘

The subquery generates sumState for every number from 0 to 9. sumState returns the state of the sum function that contains the sum of a single number.

The whole query does the following:

  1. For the first row, runningAccumulate takes sumState(0) and returns 0.
  2. For the second row, the function merges sumState(0) and sumState(1) resulting in sumState(0 + 1), and returns 1 as a result.
  3. For the third row, the function merges sumState(0 + 1) and sumState(2) resulting in sumState(0 + 1 + 2), and returns 3 as a result.
  4. The actions are repeated until the block ends.

The following example shows the groupping parameter usage:

Query:

SELECT
    grouping,
    item,
    runningAccumulate(state, grouping) AS res
FROM
(
    SELECT
        toInt8(number / 4) AS grouping,
        number AS item,
        sumState(number) AS state
    FROM numbers(15)
    GROUP BY item
    ORDER BY item ASC
);

Result:

┌─grouping─┬─item─┬─res─┐
│        0 │    0 │   0 │
│        0 │    1 │   1 │
│        0 │    2 │   3 │
│        0 │    3 │   6 │
│        1 │    4 │   4 │
│        1 │    5 │   9 │
│        1 │    6 │  15 │
│        1 │    7 │  22 │
│        2 │    8 │   8 │
│        2 │    9 │  17 │
│        2 │   10 │  27 │
│        2 │   11 │  38 │
│        3 │   12 │  12 │
│        3 │   13 │  25 │
│        3 │   14 │  39 │
└──────────┴──────┴─────┘

As you can see, runningAccumulate merges states for each group of rows separately.

joinGet

The function lets you extract data from the table the same way as from a dictionary. Gets the data from Join tables using the specified join key.

Only supports tables created with the ENGINE = Join(ANY, LEFT, <join_keys>) statement.


Syntax

joinGet(join_storage_table_name, `value_column`, join_keys)

Arguments

  • join_storage_table_name — an identifier indicating where the search is performed.
  • value_column — name of the column of the table that contains required data.
  • join_keys — list of keys.

The identifier is searched for in the default database (see setting default_database in the config file). To override the default database, use USE db_name or specify the database and the table through the separator db_name.db_table as in the example.


Returned value

  • Returns a list of values corresponded to the list of keys.

If a certain key does not exist in source table then 0 or null will be returned based on join_use_nulls setting during table creation. More info about join_use_nulls in Join operation.


Example

Input table:

CREATE DATABASE db_test;
CREATE TABLE db_test.id_val(`id` UInt32, `val` UInt32) ENGINE = Join(ANY, LEFT, id);
INSERT INTO db_test.id_val VALUES (1, 11)(2, 12)(4, 13);
SELECT * FROM db_test.id_val;
┌─id─┬─val─┐
│  4 │  13 │
│  2 │  12 │
│  1 │  11 │
└────┴─────┘

Query:

SELECT number, joinGet(db_test.id_val, 'val', toUInt32(number)) from numbers(4);

Result:

   ┌─number─┬─joinGet('db_test.id_val', 'val', toUInt32(number))─┐
1. │      0 │                                                  0 │
2. │      1 │                                                 11 │
3. │      2 │                                                 12 │
4. │      3 │                                                  0 │
   └────────┴────────────────────────────────────────────────────┘

Setting join_use_nulls can be used during table creation to change the behaviour of what gets returned if no key exists in the source table.

CREATE DATABASE db_test;
CREATE TABLE db_test.id_val_nulls(`id` UInt32, `val` UInt32) ENGINE = Join(ANY, LEFT, id) SETTINGS join_use_nulls=1;
INSERT INTO db_test.id_val_nulls VALUES (1, 11)(2, 12)(4, 13);
SELECT * FROM db_test.id_val_nulls;
┌─id─┬─val─┐
│  4 │  13 │
│  2 │  12 │
│  1 │  11 │
└────┴─────┘

Query:

SELECT number, joinGet(db_test.id_val_nulls, 'val', toUInt32(number)) from numbers(4);

Result:

   ┌─number─┬─joinGet('db_test.id_val_nulls', 'val', toUInt32(number))─┐
1. │      0 │                                                     ᴺᵁᴸᴸ │
2. │      1 │                                                       11 │
3. │      2 │                                                       12 │
4. │      3 │                                                     ᴺᵁᴸᴸ │
   └────────┴──────────────────────────────────────────────────────────┘

joinGetOrNull

Like joinGet but returns NULL when the key is missing instead of returning the default value.

Syntax

joinGetOrNull(join_storage_table_name, `value_column`, join_keys)

Arguments

  • join_storage_table_name — an identifier indicating where the search is performed.
  • value_column — name of the column of the table that contains required data.
  • join_keys — list of keys.

The identifier is searched for in the default database (see setting default_database in the config file). To override the default database, use USE db_name or specify the database and the table through the separator db_name.db_table as in the example.


Returned value

  • Returns a list of values corresponded to the list of keys.

If a certain key does not exist in source table then NULL is returned for that key.


Example

Input table:

CREATE DATABASE db_test;
CREATE TABLE db_test.id_val(`id` UInt32, `val` UInt32) ENGINE = Join(ANY, LEFT, id);
INSERT INTO db_test.id_val VALUES (1, 11)(2, 12)(4, 13);
SELECT * FROM db_test.id_val;
┌─id─┬─val─┐
│  4 │  13 │
│  2 │  12 │
│  1 │  11 │
└────┴─────┘

Query:

SELECT number, joinGetOrNull(db_test.id_val, 'val', toUInt32(number)) from numbers(4);

Result:

   ┌─number─┬─joinGetOrNull('db_test.id_val', 'val', toUInt32(number))─┐
1. │      0 │                                                     ᴺᵁᴸᴸ │
2. │      1 │                                                       11 │
3. │      2 │                                                       12 │
4. │      3 │                                                     ᴺᵁᴸᴸ │
   └────────┴──────────────────────────────────────────────────────────┘

catboostEvaluate

This function is not available in ClickHouse Cloud.


Evaluate an external catboost model. CatBoost is an open-source gradient boosting library developed by Yandex for machine learning. Accepts a path to a catboost model and model arguments (features). Returns Float64.

Syntax

catboostEvaluate(path_to_model, feature_1, feature_2, ..., feature_n)

Example

SELECT feat1, ..., feat_n, catboostEvaluate('/path/to/model.bin', feat_1, ..., feat_n) AS prediction
FROM data_table

Prerequisites

  1. Build the catboost evaluation library

Before evaluating catboost models, the libcatboostmodel.<so|dylib> library must be made available. See CatBoost documentation how to compile it.

Next, specify the path to libcatboostmodel.<so|dylib> in the clickhouse configuration:

<clickhouse>
...
    <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>
...
</clickhouse>

For security and isolation reasons, the model evaluation does not run in the server process but in the clickhouse-library-bridge process. At the first execution of catboostEvaluate(), the server starts the library bridge process if it is not running already. Both processes communicate using a HTTP interface. By default, port 9012 is used. A different port can be specified as follows - this is useful if port 9012 is already assigned to a different service.

<library_bridge>
    <port>9019</port>
</library_bridge>
  1. Train a catboost model using libcatboost

See Training and applying models for how to train catboost models from a training data set.

throwIf

Throw an exception if argument x is true.

Syntax

throwIf(x[, message[, error_code]])

Arguments

  • x - the condition to check.
  • message - a constant string providing a custom error message. Optional.
  • error_code - A constant integer providing a custom error code. Optional.

To use the error_code argument, configuration parameter allow_custom_error_code_in_throwif must be enabled.

Example

SELECT throwIf(number = 3, 'Too many') FROM numbers(10);

Result:

↙ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) Received exception from server (version 19.14.1):
Code: 395. DB::Exception: Received from localhost:9000. DB::Exception: Too many.

identity

Returns its argument. Intended for debugging and testing. Allows to cancel using index, and get the query performance of a full scan. When the query is analyzed for possible use of an index, the analyzer ignores everything in identity functions. Also disables constant folding.

Syntax

identity(x)

Example

Query:

SELECT identity(42);

Result:

┌─identity(42)─┐
│           42 │
└──────────────┘

getSetting

Returns the current value of a custom setting.

Syntax

getSetting('custom_setting');

Parameter

  • custom_setting — The setting name. String.

Returned value

  • The setting’s current value.

Example

SET custom_a = 123;
SELECT getSetting('custom_a');

Result:

123

See Also

isDecimalOverflow

Checks whether the Decimal value is outside its precision or outside the specified precision.

Syntax

isDecimalOverflow(d, [p])

Arguments

  • d — value. Decimal.
  • p — precision. Optional. If omitted, the initial precision of the first argument is used. This parameter can be helpful to migrate data from/to another database or file. UInt8.

Returned values

  • 1 — Decimal value has more digits then allowed by its precision,
  • 0 — Decimal value satisfies the specified precision.

Example

Query:

SELECT isDecimalOverflow(toDecimal32(1000000000, 0), 9),
       isDecimalOverflow(toDecimal32(1000000000, 0)),
       isDecimalOverflow(toDecimal32(-1000000000, 0), 9),
       isDecimalOverflow(toDecimal32(-1000000000, 0));

Result:

1   1   1   1

countDigits

Returns number of decimal digits need to represent a value.

Syntax

countDigits(x)

Arguments

Returned value

For Decimal values takes into account their scales: calculates result over underlying integer type which is (value * scale). For example: countDigits(42) = 2, countDigits(42.000) = 5, countDigits(0.04200) = 4. I.e. you may check decimal overflow for Decimal64 with countDecimal(x) > 18. It’s a slow variant of isDecimalOverflow.


Example

Query:

SELECT countDigits(toDecimal32(1, 9)), countDigits(toDecimal32(-1, 9)),
       countDigits(toDecimal64(1, 18)), countDigits(toDecimal64(-1, 18)),
       countDigits(toDecimal128(1, 38)), countDigits(toDecimal128(-1, 38));

Result:

10  10  19  19  39  39

errorCodeToName

Syntax

errorCodeToName(1)

Result:

UNSUPPORTED_METHOD

tcpPort

Returns native interface TCP port number listened by this server. If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Syntax

tcpPort()

Arguments

  • None.

Returned value

Example

Query:

SELECT tcpPort();

Result:

┌─tcpPort()─┐
│      9000 │
└───────────┘

See Also

currentProfiles

Returns a list of the current settings profiles for the current user.

The command SET PROFILE could be used to change the current setting profile. If the command SET PROFILE was not used the function returns the profiles specified at the current user’s definition (see CREATE USER).

Syntax

currentProfiles()

Returned value

  • List of the current user settings profiles. Array(String).

enabledProfiles

Returns settings profiles, assigned to the current user both explicitly and implicitly. Explicitly assigned profiles are the same as returned by the currentProfiles function. Implicitly assigned profiles include parent profiles of other assigned profiles, profiles assigned via granted roles, profiles assigned via their own settings, and the main default profile (see the default_profile section in the main server configuration file).

Syntax

enabledProfiles()

Returned value

defaultProfiles

Returns all the profiles specified at the current user’s definition (see CREATE USER statement).

Syntax

defaultProfiles()

Returned value

currentRoles

Returns the roles assigned to the current user. The roles can be changed by the SET ROLE statement. If no SET ROLE statement was not, the function currentRoles returns the same as defaultRoles.

Syntax

currentRoles()

Returned value

  • A list of the current roles for the current user. Array(String).

enabledRoles

Returns the names of the current roles and the roles, granted to some of the current roles.

Syntax

enabledRoles()

Returned value

  • List of the enabled roles for the current user. Array(String).

defaultRoles

Returns the roles which are enabled by default for the current user when he logs in. Initially these are all roles granted to the current user (see GRANT), but that can be changed with the SET DEFAULT ROLE statement.

Syntax

defaultRoles()

Returned value

  • List of the default roles for the current user. Array(String).

getServerPort

Returns the server port number. When the port is not used by the server, throws an exception.

Syntax

getServerPort(port_name)

Arguments

  • port_name — The name of the server port. String. Possible values:
    • ‘tcp_port’
    • ‘tcp_port_secure’
    • ‘http_port’
    • ‘https_port’
    • ‘interserver_http_port’
    • ‘interserver_https_port’
    • ‘mysql_port’
    • ‘postgresql_port’
    • ‘grpc_port’
    • ‘prometheus.port’

Returned value

  • The number of the server port. UInt16.

Example

Query:

SELECT getServerPort('tcp_port');

Result:

┌─getServerPort('tcp_port')─┐
│ 9000                      │
└───────────────────────────┘

queryID

Returns the ID of the current query. Other parameters of a query can be extracted from the system.query_log table via query_id.

In contrast to initialQueryID function, queryID can return different results on different shards (see the example).

Syntax

queryID()

Returned value

  • The ID of the current query. String

Example

Query:

CREATE TABLE tmp (str String) ENGINE = Log;
INSERT INTO tmp (*) VALUES ('a');
SELECT count(DISTINCT t) FROM (SELECT queryID() AS t FROM remote('127.0.0.{1..3}', currentDatabase(), 'tmp') GROUP BY queryID());

Result:

┌─count()─┐
│ 3       │
└─────────┘

initialQueryID

Returns the ID of the initial current query. Other parameters of a query can be extracted from the system.query_log table via initial_query_id.

In contrast to queryID function, initialQueryID returns the same results on different shards (see example).

Syntax

initialQueryID()

Returned value

  • The ID of the initial current query. String

Example

Query:

CREATE TABLE tmp (str String) ENGINE = Log;
INSERT INTO tmp (*) VALUES ('a');
SELECT count(DISTINCT t) FROM (SELECT initialQueryID() AS t FROM remote('127.0.0.{1..3}', currentDatabase(), 'tmp') GROUP BY queryID());

Result:

┌─count()─┐
│ 1       │
└─────────┘

partitionID

Computes the partition ID.

This function is slow and should not be called for large amount of rows.


Syntax

partitionID(x[, y, ...]);

Arguments

  • x — Column for which to return the partition ID.
  • y, ... — Remaining N columns for which to return the partition ID (optional).

Returned Value

  • Partition ID that the row would belong to. String.

Example

Query:

DROP TABLE IF EXISTS tab;

CREATE TABLE tab
(
  i int,
  j int
)
ENGINE = MergeTree
PARTITION BY i
ORDER BY tuple();

INSERT INTO tab VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (2, 6);

SELECT i, j, partitionID(i), _partition_id FROM tab ORDER BY i, j;

Result:

┌─i─┬─j─┬─partitionID(i)─┬─_partition_id─┐
│ 1 │ 1 │ 1              │ 1             │
│ 1 │ 2 │ 1              │ 1             │
│ 1 │ 3 │ 1              │ 1             │
└───┴───┴────────────────┴───────────────┘
┌─i─┬─j─┬─partitionID(i)─┬─_partition_id─┐
│ 2 │ 4 │ 2              │ 2             │
│ 2 │ 5 │ 2              │ 2             │
│ 2 │ 6 │ 2              │ 2             │
└───┴───┴────────────────┴───────────────┘

shardNum

Returns the index of a shard which processes a part of data in a distributed query. Indices are started from 1. If a query is not distributed then constant value 0 is returned.

Syntax

shardNum()

Returned value

  • Shard index or constant 0. UInt32.

Example

In the following example a configuration with two shards is used. The query is executed on the system.one table on every shard.

Query:

CREATE TABLE shard_num_example (dummy UInt8)
    ENGINE=Distributed(test_cluster_two_shards_localhost, system, one, dummy);
SELECT dummy, shardNum(), shardCount() FROM shard_num_example;

Result:

┌─dummy─┬─shardNum()─┬─shardCount()─┐
│     0 │          2 │            2 │
│     0 │          1 │            2 │
└───────┴────────────┴──────────────┘

See Also

shardCount

Returns the total number of shards for a distributed query. If a query is not distributed then constant value 0 is returned.

Syntax

shardCount()

Returned value

  • Total number of shards or 0. UInt32.

See Also

  • shardNum() function example also contains shardCount() function call.

getOSKernelVersion

Returns a string with the current OS kernel version.

Syntax

getOSKernelVersion()

Arguments

  • None.

Returned value

  • The current OS kernel version. String.

Example

Query:

SELECT getOSKernelVersion();

Result:

┌─getOSKernelVersion()────┐
│ Linux 4.15.0-55-generic │
└─────────────────────────┘

zookeeperSessionUptime

Returns the uptime of the current ZooKeeper session in seconds.

Syntax

zookeeperSessionUptime()

Arguments

  • None.

Returned value

  • Uptime of the current ZooKeeper session in seconds. UInt32.

Example

Query:

SELECT zookeeperSessionUptime();

Result:

┌─zookeeperSessionUptime()─┐
│                      286 │
└──────────────────────────┘

generateRandomStructure

Generates random table structure in a format column1_name column1_type, column2_name column2_type, ....

Syntax

generateRandomStructure([number_of_columns, seed])

Arguments

  • number_of_columns — The desired number of columns in the result table structure. If set to 0 or Null, the number of columns will be random from 1 to 128. Default value: Null.
  • seed - Random seed to produce stable results. If seed is not specified or set to Null, it is randomly generated.

All arguments must be constant.

Returned value

  • Randomly generated table structure. String.

Examples

Query:

SELECT generateRandomStructure()

Result:

┌─generateRandomStructure()─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ c1 Decimal32(5), c2 Date, c3 Tuple(LowCardinality(String), Int128, UInt64, UInt16, UInt8, IPv6), c4 Array(UInt128), c5 UInt32, c6 IPv4, c7 Decimal256(64), c8 Decimal128(3), c9 UInt256, c10 UInt64, c11 DateTime │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Query:

SELECT generateRandomStructure(1)

Result:

┌─generateRandomStructure(1)─┐
│ c1 Map(UInt256, UInt16)    │
└────────────────────────────┘

Query:

SELECT generateRandomStructure(NULL, 33)

Result:

┌─generateRandomStructure(NULL, 33)─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ c1 DateTime, c2 Enum8('c2V0' = 0, 'c2V1' = 1, 'c2V2' = 2, 'c2V3' = 3), c3 LowCardinality(Nullable(FixedString(30))), c4 Int16, c5 Enum8('c5V0' = 0, 'c5V1' = 1, 'c5V2' = 2, 'c5V3' = 3), c6 Nullable(UInt8), c7 String, c8 Nested(e1 IPv4, e2 UInt8, e3 UInt16, e4 UInt16, e5 Int32, e6 Map(Date, Decimal256(70))) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Note: the maximum nesting depth of complex types (Array, Tuple, Map, Nested) is limited to 16.

This function can be used together with generateRandom to generate completely random tables.

structureToCapnProtoSchema

Converts ClickHouse table structure to CapnProto schema.

Syntax

structureToCapnProtoSchema(structure)

Arguments

  • structure — Table structure in a format column1_name column1_type, column2_name column2_type, ....
  • root_struct_name — Name for root struct in CapnProto schema. Default value - Message;

Returned value

Examples

Query:

SELECT structureToCapnProtoSchema('column1 String, column2 UInt32, column3 Array(String)') FORMAT RawBLOB

Result:

@0xf96402dd754d0eb7;

struct Message
{
    column1 @0 : Data;
    column2 @1 : UInt32;
    column3 @2 : List(Data);
}

Query:

SELECT structureToCapnProtoSchema('column1 Nullable(String), column2 Tuple(element1 UInt32, element2 Array(String)), column3 Map(String, String)') FORMAT RawBLOB

Result:

@0xd1c8320fecad2b7f;

struct Message
{
    struct Column1
    {
        union
        {
            value @0 : Data;
            null @1 : Void;
        }
    }
    column1 @0 : Column1;
    struct Column2
    {
        element1 @0 : UInt32;
        element2 @1 : List(Data);
    }
    column2 @1 : Column2;
    struct Column3
    {
        struct Entry
        {
            key @0 : Data;
            value @1 : Data;
        }
        entries @0 : List(Entry);
    }
    column3 @2 : Column3;
}

Query:

SELECT structureToCapnProtoSchema('column1 String, column2 UInt32', 'Root') FORMAT RawBLOB

Result:

@0x96ab2d4ab133c6e1;

struct Root
{
    column1 @0 : Data;
    column2 @1 : UInt32;
}

structureToProtobufSchema

Converts ClickHouse table structure to Protobuf schema.

Syntax

structureToProtobufSchema(structure)

Arguments

  • structure — Table structure in a format column1_name column1_type, column2_name column2_type, ....
  • root_message_name — Name for root message in Protobuf schema. Default value - Message;

Returned value

Examples

Query:

SELECT structureToProtobufSchema('column1 String, column2 UInt32, column3 Array(String)') FORMAT RawBLOB

Result:

syntax = "proto3";

message Message
{
    bytes column1 = 1;
    uint32 column2 = 2;
    repeated bytes column3 = 3;
}

Query:

SELECT structureToProtobufSchema('column1 Nullable(String), column2 Tuple(element1 UInt32, element2 Array(String)), column3 Map(String, String)') FORMAT RawBLOB

Result:

syntax = "proto3";

message Message
{
    bytes column1 = 1;
    message Column2
    {
        uint32 element1 = 1;
        repeated bytes element2 = 2;
    }
    Column2 column2 = 2;
    map<string, bytes> column3 = 3;
}

Query:

SELECT structureToProtobufSchema('column1 String, column2 UInt32', 'Root') FORMAT RawBLOB

Result:

syntax = "proto3";

message Root
{
    bytes column1 = 1;
    uint32 column2 = 2;
}

formatQuery

Returns a formatted, possibly multi-line, version of the given SQL query.

Throws an exception if the query is not well-formed. To return NULL instead, function formatQueryOrNull() may be used.

Syntax

formatQuery(query)
formatQueryOrNull(query)

Arguments

  • query - The SQL query to be formatted. String

Returned value

Example

SELECT formatQuery('select a,    b FRom tab WHERE a > 3 and  b < 3');

Result:

┌─formatQuery('select a,    b FRom tab WHERE a > 3 and  b < 3')─┐
│ SELECT
    a,
    b
FROM tab
WHERE (a > 3) AND (b < 3)            │
└───────────────────────────────────────────────────────────────┘

formatQuerySingleLine

Like formatQuery() but the returned formatted string contains no line breaks.

Throws an exception if the query is not well-formed. To return NULL instead, function formatQuerySingleLineOrNull() may be used.

Syntax

formatQuerySingleLine(query)
formatQuerySingleLineOrNull(query)

Arguments

  • query - The SQL query to be formatted. String

Returned value

Example

SELECT formatQuerySingleLine('select a,    b FRom tab WHERE a > 3 and  b < 3');

Result:

┌─formatQuerySingleLine('select a,    b FRom tab WHERE a > 3 and  b < 3')─┐
│ SELECT a, b FROM tab WHERE (a > 3) AND (b < 3)                          │
└─────────────────────────────────────────────────────────────────────────┘

variantElement

Extracts a column with specified type from a Variant column.

Syntax

variantElement(variant, type_name, [, default_value])

Arguments

  • variant — Variant column. Variant.
  • type_name — The name of the variant type to extract. String.
  • default_value - The default value that will be used if variant doesn’t have variant with specified type. Can be any type. Optional.

Returned value

  • Subcolumn of a Variant column with specified type.

Example

CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT v, variantElement(v, 'String'), variantElement(v, 'UInt64'), variantElement(v, 'Array(UInt64)') FROM test;
┌─v─────────────┬─variantElement(v, 'String')─┬─variantElement(v, 'UInt64')─┬─variantElement(v, 'Array(UInt64)')─┐
│ ᴺᵁᴸᴸ          │ ᴺᵁᴸᴸ                        │                        ᴺᵁᴸᴸ │ []                                 │
│ 42            │ ᴺᵁᴸᴸ                        │                          42 │ []                                 │
│ Hello, World! │ Hello, World!               │                        ᴺᵁᴸᴸ │ []                                 │
│ [1,2,3]       │ ᴺᵁᴸᴸ                        │                        ᴺᵁᴸᴸ │ [1,2,3]                            │
└───────────────┴─────────────────────────────┴─────────────────────────────┴────────────────────────────────────┘

variantType

Returns the variant type name for each row of Variant column. If row contains NULL, it returns 'None' for it.

Syntax

variantType(variant)

Arguments

  • variant — Variant column. Variant.

Returned value

  • Enum8 column with variant type name for each row.

Example

CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT variantType(v) FROM test;
┌─variantType(v)─┐
│ None           │
│ UInt64         │
│ String         │
│ Array(UInt64)  │
└────────────────┘
SELECT toTypeName(variantType(v)) FROM test LIMIT 1;
┌─toTypeName(variantType(v))──────────────────────────────────────────┐
│ Enum8('None' = -1, 'Array(UInt64)' = 0, 'String' = 1, 'UInt64' = 2) │
└─────────────────────────────────────────────────────────────────────┘

minSampleSizeConversion

Calculates minimum required sample size for an A/B test comparing conversions (proportions) in two samples.

Syntax

minSampleSizeConversion(baseline, mde, power, alpha)

Uses the formula described in this article. Assumes equal sizes of treatment and control groups. Returns the sample size required for one group (i.e. the sample size required for the whole experiment is twice the returned value).

Arguments

  • baseline — Baseline conversion. Float.
  • mde — Minimum detectable effect (MDE) as percentage points (e.g. for a baseline conversion 0.25 the MDE 0.03 means an expected change to 0.25 ± 0.03). Float.
  • power — Required statistical power of a test (1 - probability of Type II error). Float.
  • alpha — Required significance level of a test (probability of Type I error). Float.

Returned value

A named Tuple with 3 elements:

  • "minimum_sample_size" — Required sample size. Float64.
  • "detect_range_lower" — Lower bound of the range of values not detectable with the returned required sample size (i.e. all values less than or equal to "detect_range_lower" are detectable with the provided alpha and power). Calculated as baseline - mde. Float64.
  • "detect_range_upper" — Upper bound of the range of values not detectable with the returned required sample size (i.e. all values greater than or equal to "detect_range_upper" are detectable with the provided alpha and power). Calculated as baseline + mde. Float64.

Example

The following query calculates the required sample size for an A/B test with baseline conversion of 25%, MDE of 3%, significance level of 5%, and the desired statistical power of 80%:

SELECT minSampleSizeConversion(0.25, 0.03, 0.80, 0.05) AS sample_size;

Result:

┌─sample_size───────────────────┐
│ (3396.077603219163,0.22,0.28) │
└───────────────────────────────┘

minSampleSizeContinuous

Calculates minimum required sample size for an A/B test comparing means of a continuous metric in two samples.

Syntax

minSampleSizeContinous(baseline, sigma, mde, power, alpha)

Alias: minSampleSizeContinous

Uses the formula described in this article. Assumes equal sizes of treatment and control groups. Returns the required sample size for one group (i.e. the sample size required for the whole experiment is twice the returned value). Also assumes equal variance of the test metric in treatment and control groups.

Arguments

  • baseline — Baseline value of a metric. Integer or Float.
  • sigma — Baseline standard deviation of a metric. Integer or Float.
  • mde — Minimum detectable effect (MDE) as percentage of the baseline value (e.g. for a baseline value 112.25 the MDE 0.03 means an expected change to 112.25 ± 112.25*0.03). Integer or Float.
  • power — Required statistical power of a test (1 - probability of Type II error). Integer or Float.
  • alpha — Required significance level of a test (probability of Type I error). Integer or Float.

Returned value

A named Tuple with 3 elements:

  • "minimum_sample_size" — Required sample size. Float64.
  • "detect_range_lower" — Lower bound of the range of values not detectable with the returned required sample size (i.e. all values less than or equal to "detect_range_lower" are detectable with the provided alpha and power). Calculated as baseline * (1 - mde). Float64.
  • "detect_range_upper" — Upper bound of the range of values not detectable with the returned required sample size (i.e. all values greater than or equal to "detect_range_upper" are detectable with the provided alpha and power). Calculated as baseline * (1 + mde). Float64.

Example

The following query calculates the required sample size for an A/B test on a metric with baseline value of 112.25, standard deviation of 21.1, MDE of 3%, significance level of 5%, and the desired statistical power of 80%:

SELECT minSampleSizeContinous(112.25, 21.1, 0.03, 0.80, 0.05) AS sample_size;

Result:

┌─sample_size───────────────────────────┐
│ (616.2931945826209,108.8825,115.6175) │
└───────────────────────────────────────┘

connectionId

Retrieves the connection ID of the client that submitted the current query and returns it as a UInt64 integer.

Syntax

connectionId()

Parameters

None.

Returned value

Returns an integer of type UInt64.

Implementation details

This function is most useful in debugging scenarios or for internal purposes within the MySQL handler. It was created for compatibility with MySQL’s CONNECTION_ID function It is not typically used in production queries.

Example

Query:

SELECT connectionId();
0

connection_id

An alias of connectionId. Retrieves the connection ID of the client that submitted the current query and returns it as a UInt64 integer.

Syntax

connection_id()

Parameters

None.

Returned value

Returns an integer of type UInt64.

Implementation details

This function is most useful in debugging scenarios or for internal purposes within the MySQL handler. It was created for compatibility with MySQL’s CONNECTION_ID function It is not typically used in production queries.

Example

Query:

SELECT connection_id();
0

getClientHTTPHeader

Get the value of an HTTP header.

If there is no such header or the current request is not performed via the HTTP interface, the function returns an empty string. Certain HTTP headers (e.g., Authentication and X-ClickHouse-*) are restricted.

The function requires the setting allow_get_client_http_header to be enabled. The setting is not enabled by default for security reasons, because some headers, such as Cookie, could contain sensitive info.

HTTP headers are case sensitive for this function.

If the function is used in the context of a distributed query, it returns non-empty result only on the initiator node.

showCertificate

Shows information about the current server’s Secure Sockets Layer (SSL) certificate if it has been configured. See Configuring SSL-TLS for more information on how to configure ClickHouse to use OpenSSL certificates to validate connections.

Syntax

showCertificate()

Returned value

  • Map of key-value pairs relating to the configured SSL certificate. Map(String, String).

Example

Query:

SELECT showCertificate() FORMAT LineAsString;

Result:

{'version':'1','serial_number':'2D9071D64530052D48308473922C7ADAFA85D6C5','signature_algo':'sha256WithRSAEncryption','issuer':'/CN=marsnet.local CA','not_before':'May  7 17:01:21 2024 GMT','not_after':'May  7 17:01:21 2025 GMT','subject':'/CN=chnode1','pkey_algo':'rsaEncryption'}

lowCardinalityIndices

Returns the position of a value in the dictionary of a LowCardinality column. Positions start at 1. Since LowCardinality have per-part dictionaries, this function may return different positions for the same value in different parts.

Syntax

lowCardinalityIndices(col)

Arguments

Returned value

  • The position of the value in the dictionary of the current part. UInt64.

Example

Query:

DROP TABLE IF EXISTS test;
CREATE TABLE test (s LowCardinality(String)) ENGINE = Memory;

-- create two parts:

INSERT INTO test VALUES ('ab'), ('cd'), ('ab'), ('ab'), ('df');
INSERT INTO test VALUES ('ef'), ('cd'), ('ab'), ('cd'), ('ef');

SELECT s, lowCardinalityIndices(s) FROM test;

Result:

   ┌─s──┬─lowCardinalityIndices(s)─┐
1. │ ab │                        1 │
2. │ cd │                        2 │
3. │ ab │                        1 │
4. │ ab │                        1 │
5. │ df │                        3 │
   └────┴──────────────────────────┘
    ┌─s──┬─lowCardinalityIndices(s)─┐
 6. │ ef │                        1 │
 7. │ cd │                        2 │
 8. │ ab │                        3 │
 9. │ cd │                        2 │
10. │ ef │                        1 │
    └────┴──────────────────────────┘

lowCardinalityKeys

Returns the dictionary values of a LowCardinality column. If the block is smaller or larger than the dictionary size, the result will be truncated or extended with default values. Since LowCardinality have per-part dictionaries, this function may return different dictionary values in different parts.

Syntax

lowCardinalityIndices(col)

Arguments

Returned value

Example

Query:

DROP TABLE IF EXISTS test;
CREATE TABLE test (s LowCardinality(String)) ENGINE = Memory;

-- create two parts:

INSERT INTO test VALUES ('ab'), ('cd'), ('ab'), ('ab'), ('df');
INSERT INTO test VALUES ('ef'), ('cd'), ('ab'), ('cd'), ('ef');

SELECT s, lowCardinalityKeys(s) FROM test;

Result:

   ┌─s──┬─lowCardinalityKeys(s)─┐
1. │ ef │                       │
2. │ cd │ ef                    │
3. │ ab │ cd                    │
4. │ cd │ ab                    │
5. │ ef │                       │
   └────┴───────────────────────┘
    ┌─s──┬─lowCardinalityKeys(s)─┐
 6. │ ab │                       │
 7. │ cd │ ab                    │
 8. │ ab │ cd                    │
 9. │ ab │ df                    │
10. │ df │                       │
    └────┴───────────────────────┘

displayName

Returns the value of display_name from config or server Fully Qualified Domain Name (FQDN) if not set.

Syntax

displayName()

Returned value

  • Value of display_name from config or server FQDN if not set. String.

Example

The display_name can be set in config.xml. Taking for example a server with display_name configured to ‘production’:

<!-- It is the name that will be shown in the clickhouse-client.
     By default, anything with "production" will be highlighted in red in query prompt.
-->
<display_name>production</display_name>

Query:

SELECT displayName();

Result:

┌─displayName()─┐
│ production    │
└───────────────┘

transactionID

Returns the ID of a transaction.

This function is part of an experimental feature set. Enable experimental transaction support by adding this setting to your configuration:

<clickhouse>
  <allow_experimental_transactions>1</allow_experimental_transactions>
</clickhouse>

For more information see the page Transactional (ACID) support.


Syntax

transactionID()

Returned value

  • Returns a tuple consisting of start_csn, local_tid and host_id. Tuple.
  • start_csn: Global sequential number, the newest commit timestamp that was seen when this transaction began. UInt64.
  • local_tid: Local sequential number that is unique for each transaction started by this host within a specific start_csn. UInt64.
  • host_id: UUID of the host that has started this transaction. UUID.

Example

Query:

BEGIN TRANSACTION;
SELECT transactionID();
ROLLBACK;

Result:

┌─transactionID()────────────────────────────────┐
│ (32,34,'0ee8b069-f2bb-4748-9eae-069c85b5252b') │
└────────────────────────────────────────────────┘

transactionLatestSnapshot

Returns the newest snapshot (Commit Sequence Number) of a transaction that is available for reading.

This function is part of an experimental feature set. Enable experimental transaction support by adding this setting to your configuration:

<clickhouse>
  <allow_experimental_transactions>1</allow_experimental_transactions>
</clickhouse>

For more information see the page Transactional (ACID) support.


Syntax

transactionLatestSnapshot()

Returned value

  • Returns the latest snapshot (CSN) of a transaction. UInt64

Example

Query:

BEGIN TRANSACTION;
SELECT transactionLatestSnapshot();
ROLLBACK;

Result:

┌─transactionLatestSnapshot()─┐
│                          32 │
└─────────────────────────────┘

transactionOldestSnapshot

Returns the oldest snapshot (Commit Sequence Number) that is visible for some running transaction.

This function is part of an experimental feature set. Enable experimental transaction support by adding this setting to your configuration:

<clickhouse>
  <allow_experimental_transactions>1</allow_experimental_transactions>
</clickhouse>

For more information see the page Transactional (ACID) support.


Syntax

transactionOldestSnapshot()

Returned value

  • Returns the oldest snapshot (CSN) of a transaction. UInt64

Example

Query:

BEGIN TRANSACTION;
SELECT transactionLatestSnapshot();
ROLLBACK;

Result:

┌─transactionOldestSnapshot()─┐
│                          32 │
└─────────────────────────────┘

getSubcolumn

Takes a table expression or identifier and constant string with the name of the sub-column, and returns the requested sub-column extracted from the expression.

Syntax

getSubcolumn(col_name, subcol_name)

Arguments

Returned value

  • Returns the extracted sub-column.

Example

Query:

CREATE TABLE t_arr (arr Array(Tuple(subcolumn1 UInt32, subcolumn2 String))) ENGINE = MergeTree ORDER BY tuple();
INSERT INTO t_arr VALUES ([(1, 'Hello'), (2, 'World')]), ([(3, 'This'), (4, 'is'), (5, 'subcolumn')]);
SELECT getSubcolumn(arr, 'subcolumn1'), getSubcolumn(arr, 'subcolumn2') FROM t_arr;

Result:

   ┌─getSubcolumn(arr, 'subcolumn1')─┬─getSubcolumn(arr, 'subcolumn2')─┐
1. │ [1,2]                           │ ['Hello','World']               │
2. │ [3,4,5]                         │ ['This','is','subcolumn']       │
   └─────────────────────────────────┴─────────────────────────────────┘

getTypeSerializationStreams

Enumerates stream paths of a data type.

This function is intended for use by developers.


Syntax

getTypeSerializationStreams(col)

Arguments

  • col — Column or string representation of a data-type from which the data type will be detected.

Returned value

  • Returns an array with all the serialization sub-stream paths.Array(String).

Examples

Query:

SELECT getTypeSerializationStreams(tuple('a', 1, 'b', 2));

Result:

   ┌─getTypeSerializationStreams(('a', 1, 'b', 2))─────────────────────────────────────────────────────────────────────────┐
1. │ ['{TupleElement(1), Regular}','{TupleElement(2), Regular}','{TupleElement(3), Regular}','{TupleElement(4), Regular}'] │
   └───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Query:

SELECT getTypeSerializationStreams('Map(String, Int64)');

Result:

   ┌─getTypeSerializationStreams('Map(String, Int64)')────────────────────────────────────────────────────────────────┐
1. │ ['{ArraySizes}','{ArrayElements, TupleElement(keys), Regular}','{ArrayElements, TupleElement(values), Regular}'] │
   └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘