Relational Data - Data Types - Variant
This type represents a union of other data types. Type Variant(T1, T2, ..., TN)
means that each row of this type has a value of either type T1
or T2
or … or TN
or none of them (NULL
value).
The order of nested types doesn’t matter: Variant(T1, T2) = Variant(T2, T1). Nested types can be arbitrary types except Nullable(…), LowCardinality(Nullable(…)) and Variant(…) types.
It’s not recommended to use similar types as variants (for example different numeric types like Variant(UInt32, Int64)
or different date types like Variant(Date, DateTime)
), because working with values of such types can lead to ambiguity. By default, creating such Variant
type will lead to an exception, but can be enabled using setting allow_suspicious_variant_types
The Variant data type is an experimental feature. To use it, set allow_experimental_variant_type = 1
.
Creating Variant
Using Variant
type in table column definition:
CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT v FROM test;
┌─v─────────────┐
│ ᴺᵁᴸᴸ │
│ 42 │
│ Hello, World! │
│ [1,2,3] │
└───────────────┘
Using CAST from ordinary columns:
SELECT toTypeName(variant) as type_name, 'Hello, World!'::Variant(UInt64, String, Array(UInt64)) as variant;
┌─type_name──────────────────────────────┬─variant───────┐
│ Variant(Array(UInt64), String, UInt64) │ Hello, World! │
└────────────────────────────────────────┴───────────────┘
Using functions if/multiIf
when arguments don’t have common type (setting use_variant_as_common_type
should be enabled for it):
SET use_variant_as_common_type = 1;
SELECT if(number % 2, number, range(number)) as variant FROM numbers(5);
┌─variant───┐
│ [] │
│ 1 │
│ [0,1] │
│ 3 │
│ [0,1,2,3] │
└───────────┘
SET use_variant_as_common_type = 1;
SELECT multiIf((number % 4) = 0, 42, (number % 4) = 1, [1, 2, 3], (number % 4) = 2, 'Hello, World!', NULL) AS variant FROM numbers(4);
┌─variant───────┐
│ 42 │
│ [1,2,3] │
│ Hello, World! │
│ ᴺᵁᴸᴸ │
└───────────────┘
Using functions ‘array/map’ if array elements/map values don’t have common type (setting use_variant_as_common_type
should be enabled for it):
SET use_variant_as_common_type = 1;
SELECT array(range(number), number, 'str_' || toString(number)) as array_of_variants FROM numbers(3);
┌─array_of_variants─┐
│ [[],0,'str_0'] │
│ [[0],1,'str_1'] │
│ [[0,1],2,'str_2'] │
└───────────────────┘
SET use_variant_as_common_type = 1;
SELECT map('a', range(number), 'b', number, 'c', 'str_' || toString(number)) as map_of_variants FROM numbers(3);
┌─map_of_variants───────────────┐
│ {'a':[],'b':0,'c':'str_0'} │
│ {'a':[0],'b':1,'c':'str_1'} │
│ {'a':[0,1],'b':2,'c':'str_2'} │
└───────────────────────────────┘
Reading Variant nested types as subcolumns
Variant type supports reading a single nested type from a Variant column using the type name as a subcolumn. So, if you have column variant Variant(T1, T2, T3)
you can read a subcolumn of type T2
using syntax variant.T2
, this subcolumn will have type Nullable(T2)
if T2
can be inside Nullable
and T2
otherwise. This subcolumn will be the same size as original Variant
column and will contain NULL
values (or empty values if T2
cannot be inside Nullable
) in all rows in which original Variant
column doesn’t have type T2
.
Variant subcolumns can be also read using function variantElement(variant_column, type_name)
.
Examples:
CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT v, v.String, v.UInt64, v.`Array(UInt64)` FROM test;
┌─v─────────────┬─v.String──────┬─v.UInt64─┬─v.Array(UInt64)─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │
│ 42 │ ᴺᵁᴸᴸ │ 42 │ [] │
│ Hello, World! │ Hello, World! │ ᴺᵁᴸᴸ │ [] │
│ [1,2,3] │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [1,2,3] │
└───────────────┴───────────────┴──────────┴─────────────────┘
SELECT toTypeName(v.String), toTypeName(v.UInt64), toTypeName(v.`Array(UInt64)`) FROM test LIMIT 1;
┌─toTypeName(v.String)─┬─toTypeName(v.UInt64)─┬─toTypeName(v.Array(UInt64))─┐
│ Nullable(String) │ Nullable(UInt64) │ Array(UInt64) │
└──────────────────────┴──────────────────────┴─────────────────────────────┘
SELECT v, variantElement(v, 'String'), variantElement(v, 'UInt64'), variantElement(v, 'Array(UInt64)') FROM test;
┌─v─────────────┬─variantElement(v, 'String')─┬─variantElement(v, 'UInt64')─┬─variantElement(v, 'Array(UInt64)')─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │
│ 42 │ ᴺᵁᴸᴸ │ 42 │ [] │
│ Hello, World! │ Hello, World! │ ᴺᵁᴸᴸ │ [] │
│ [1,2,3] │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [1,2,3] │
└───────────────┴─────────────────────────────┴─────────────────────────────┴────────────────────────────────────┘
To know what variant is stored in each row function variantType(variant_column)
can be used. It returns Enum
with variant type name for each row (or 'None'
if row is NULL
).
Example:
CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT variantType(v) from test;
┌─variantType(v)─┐
│ None │
│ UInt64 │
│ String │
│ Array(UInt64) │
└────────────────┘
SELECT toTypeName(variantType(v)) FROM test LIMIT 1;
┌─toTypeName(variantType(v))──────────────────────────────────────────┐
│ Enum8('None' = -1, 'Array(UInt64)' = 0, 'String' = 1, 'UInt64' = 2) │
└─────────────────────────────────────────────────────────────────────┘
Conversion between a Variant column and other columns
There are 4 possible conversions that can be performed with a column of type Variant
.
Converting a String column to a Variant column
Conversion from String
to Variant
is performed by parsing a value of Variant
type from the string value:
SELECT '42'::Variant(String, UInt64) as variant, variantType(variant) as variant_type
┌─variant─┬─variant_type─┐
│ 42 │ UInt64 │
└─────────┴──────────────┘
SELECT '[1, 2, 3]'::Variant(String, Array(UInt64)) as variant, variantType(variant) as variant_type
┌─variant─┬─variant_type──┐
│ [1,2,3] │ Array(UInt64) │
└─────────┴───────────────┘
SELECT CAST(map('key1', '42', 'key2', 'true', 'key3', '2020-01-01'), 'Map(String, Variant(UInt64, Bool, Date))') as map_of_variants, mapApply((k, v) -> (k, variantType(v)), map_of_variants) as map_of_variant_types```
┌─map_of_variants─────────────────────────────┬─map_of_variant_types──────────────────────────┐
│ {'key1':42,'key2':true,'key3':'2020-01-01'} │ {'key1':'UInt64','key2':'Bool','key3':'Date'} │
└─────────────────────────────────────────────┴───────────────────────────────────────────────┘
Converting an ordinary column to a Variant column
It is possible to convert an ordinary column with type T
to a Variant
column containing this type:
SELECT toTypeName(variant) as type_name, [1,2,3]::Array(UInt64)::Variant(UInt64, String, Array(UInt64)) as variant, variantType(variant) as variant_name
┌─type_name──────────────────────────────┬─variant─┬─variant_name──┐
│ Variant(Array(UInt64), String, UInt64) │ [1,2,3] │ Array(UInt64) │
└────────────────────────────────────────┴─────────┴───────────────┘
Note: converting from String
type is always performed through parsing, if you need to convert String
column to String
variant of a Variant
without parsing, you can do the following:
SELECT '[1, 2, 3]'::Variant(String)::Variant(String, Array(UInt64), UInt64) as variant, variantType(variant) as variant_type
┌─variant───┬─variant_type─┐
│ [1, 2, 3] │ String │
└───────────┴──────────────┘
Converting a Variant column to an ordinary column
It is possible to convert a Variant
column to an ordinary column. In this case all nested variants will be converted to a destination type:
CREATE TABLE test (v Variant(UInt64, String)) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('42.42');
SELECT v::Nullable(Float64) FROM test;
┌─CAST(v, 'Nullable(Float64)')─┐
│ ᴺᵁᴸᴸ │
│ 42 │
│ 42.42 │
└──────────────────────────────┘
Converting a Variant to another Variant
It is possible to convert a Variant
column to another Variant
column, but only if the destination Variant
column contains all nested types from the original Variant
:
CREATE TABLE test (v Variant(UInt64, String)) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('String');
SELECT v::Variant(UInt64, String, Array(UInt64)) FROM test;
┌─CAST(v, 'Variant(UInt64, String, Array(UInt64))')─┐
│ ᴺᵁᴸᴸ │
│ 42 │
│ String │
└───────────────────────────────────────────────────┘
Reading Variant type from the data
All text formats (TSV, CSV, CustomSeparated, Values, JSONEachRow, etc) supports reading Variant
type. During data parsing ClickHouse tries to insert value into most appropriate variant type.
Example:
SELECT
v,
variantElement(v, 'String') AS str,
variantElement(v, 'UInt64') AS num,
variantElement(v, 'Float64') AS float,
variantElement(v, 'DateTime') AS date,
variantElement(v, 'Array(UInt64)') AS arr
FROM format(JSONEachRow, 'v Variant(String, UInt64, Float64, DateTime, Array(UInt64))', $$
{"v" : "Hello, World!"},
{"v" : 42},
{"v" : 42.42},
{"v" : "2020-01-01 00:00:00"},
{"v" : [1, 2, 3]}
$$)
┌─v───────────────────┬─str───────────┬──num─┬─float─┬────────────────date─┬─arr─────┐
│ Hello, World! │ Hello, World! │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │
│ 42 │ ᴺᵁᴸᴸ │ 42 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │
│ 42.42 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ 42.42 │ ᴺᵁᴸᴸ │ [] │
│ 2020-01-01 00:00:00 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ 2020-01-01 00:00:00 │ [] │
│ [1,2,3] │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [1,2,3] │
└─────────────────────┴───────────────┴──────┴───────┴─────────────────────┴─────────┘
Comparing values of Variant type
Values of a Variant
type can be compared only with values with the same Variant
type.
The result of operator <
for values v1
with underlying type T1
and v2
with underlying type T2
of a type Variant(..., T1, ... T2, ...)
is defined as follows: - If T1 = T2 = T
, the result will be v1.T < v2.T
(underlying values will be compared). - If T1 != T2
, the result will be T1 < T2
(type names will be compared).
Examples:
CREATE TABLE test (v1 Variant(String, UInt64, Array(UInt32)), v2 Variant(String, UInt64, Array(UInt32))) ENGINE=Memory;
INSERT INTO test VALUES (42, 42), (42, 43), (42, 'abc'), (42, [1, 2, 3]), (42, []), (42, NULL);
SELECT v2, variantType(v2) as v2_type from test order by v2;
┌─v2──────┬─v2_type───────┐
│ [] │ Array(UInt32) │
│ [1,2,3] │ Array(UInt32) │
│ abc │ String │
│ 42 │ UInt64 │
│ 43 │ UInt64 │
│ ᴺᵁᴸᴸ │ None │
└─────────┴───────────────┘
SELECT v1, variantType(v1) as v1_type, v2, variantType(v2) as v2_type, v1 = v2, v1 < v2, v1 > v2 from test;
┌─v1─┬─v1_type─┬─v2──────┬─v2_type───────┬─equals(v1, v2)─┬─less(v1, v2)─┬─greater(v1, v2)─┐
│ 42 │ UInt64 │ 42 │ UInt64 │ 1 │ 0 │ 0 │
│ 42 │ UInt64 │ 43 │ UInt64 │ 0 │ 1 │ 0 │
│ 42 │ UInt64 │ abc │ String │ 0 │ 0 │ 1 │
│ 42 │ UInt64 │ [1,2,3] │ Array(UInt32) │ 0 │ 0 │ 1 │
│ 42 │ UInt64 │ [] │ Array(UInt32) │ 0 │ 0 │ 1 │
│ 42 │ UInt64 │ ᴺᵁᴸᴸ │ None │ 0 │ 1 │ 0 │
└────┴─────────┴─────────┴───────────────┴────────────────┴──────────────┴─────────────────┘
If you need to find the row with specific Variant
value, you can do one of the following:
- Cast value to the corresponding
Variant
type:
SELECT * FROM test WHERE v2 == [1,2,3]::Array(UInt32)::Variant(String, UInt64, Array(UInt32));
┌─v1─┬─v2──────┐
│ 42 │ [1,2,3] │
└────┴─────────┘
- Compare
Variant
subcolumn with required type:
SELECT * FROM test WHERE v2.`Array(UInt32)` == [1,2,3] -- or using variantElement(v2, 'Array(UInt32)')
┌─v1─┬─v2──────┐
│ 42 │ [1,2,3] │
└────┴─────────┘
Sometimes it can be useful to make additional check on variant type as subcolumns with complex types like Array/Map/Tuple
cannot be inside Nullable
and will have default values instead of NULL
on rows with different types:
SELECT v2, v2.`Array(UInt32)`, variantType(v2) FROM test WHERE v2.`Array(UInt32)` == [];
┌─v2───┬─v2.Array(UInt32)─┬─variantType(v2)─┐
│ 42 │ [] │ UInt64 │
│ 43 │ [] │ UInt64 │
│ abc │ [] │ String │
│ [] │ [] │ Array(UInt32) │
│ ᴺᵁᴸᴸ │ [] │ None │
└──────┴──────────────────┴─────────────────┘
SELECT v2, v2.`Array(UInt32)`, variantType(v2) FROM test WHERE variantType(v2) == 'Array(UInt32)' AND v2.`Array(UInt32)` == [];
┌─v2─┬─v2.Array(UInt32)─┬─variantType(v2)─┐
│ [] │ [] │ Array(UInt32) │
└────┴──────────────────┴─────────────────┘
Note: values of variants with different numeric types are considered as different variants and not compared between each other, their type names are compared instead.
Example:
SET allow_suspicious_variant_types = 1;
CREATE TABLE test (v Variant(UInt32, Int64)) ENGINE=Memory;
INSERT INTO test VALUES (1::UInt32), (1::Int64), (100::UInt32), (100::Int64);
SELECT v, variantType(v) FROM test ORDER by v;
┌─v───┬─variantType(v)─┐
│ 1 │ Int64 │
│ 100 │ Int64 │
│ 1 │ UInt32 │
│ 100 │ UInt32 │
└─────┴────────────────┘
JSONExtract functions with Variant
All JSONExtract*
functions support Variant
type:
SELECT JSONExtract('{"a" : [1, 2, 3]}', 'a', 'Variant(UInt32, String, Array(UInt32))') AS variant, variantType(variant) AS variant_type;
┌─variant─┬─variant_type──┐
│ [1,2,3] │ Array(UInt32) │
└─────────┴───────────────┘
SELECT JSONExtract('{"obj" : {"a" : 42, "b" : "Hello", "c" : [1,2,3]}}', 'obj', 'Map(String, Variant(UInt32, String, Array(UInt32)))') AS map_of_variants, mapApply((k, v) -> (k, variantType(v)), map_of_variants) AS map_of_variant_types
┌─map_of_variants──────────────────┬─map_of_variant_types────────────────────────────┐
│ {'a':42,'b':'Hello','c':[1,2,3]} │ {'a':'UInt32','b':'String','c':'Array(UInt32)'} │
└──────────────────────────────────┴─────────────────────────────────────────────────┘
SELECT JSONExtractKeysAndValues('{"a" : 42, "b" : "Hello", "c" : [1,2,3]}', 'Variant(UInt32, String, Array(UInt32))') AS variants, arrayMap(x -> (x.1, variantType(x.2)), variants) AS variant_types
┌─variants───────────────────────────────┬─variant_types─────────────────────────────────────────┐
│ [('a',42),('b','Hello'),('c',[1,2,3])] │ [('a','UInt32'),('b','String'),('c','Array(UInt32)')] │
└────────────────────────────────────────┴───────────────────────────────────────────────────────┘