LowCardinality
Changes the internal representation of other data types to be dictionary-encoded.
Syntax
LowCardinality(data_type)
Parameters
data_type: String, FixedString, Date, DateTime, and numbers except Decimal. LowCardinality is not efficient for some data types; see the description of the allow_suspicious_low_cardinality_types setting.
Description
LowCardinality is a superstructure that changes how data is stored and the rules by which it is processed. ClickHouse applies dictionary encoding to LowCardinality columns. Operating on dictionary-encoded data significantly improves the performance of SELECT queries for many applications.
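To make the idea of dictionary encoding concrete, here is a minimal conceptual sketch in Python. It is not ClickHouse's implementation; the function names `dictionary_encode` and `count_equal` and the sample data are assumptions made for illustration only. It shows a column being split into a dictionary of distinct values plus one small integer index per row, and a filter that compares integers instead of strings.

```python
from typing import Hashable, Sequence


def dictionary_encode(values: Sequence[Hashable]) -> tuple[list, list[int]]:
    """Split a column into (dictionary of distinct values, per-row indices)."""
    dictionary: list = []      # distinct values, in first-seen order
    positions: dict = {}       # value -> its index in `dictionary`
    indices: list[int] = []    # one small integer per row
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def count_equal(dictionary: list, indices: list[int], needle) -> int:
    """Count rows equal to `needle` by comparing integer indices, not values."""
    if needle not in dictionary:
        return 0
    target = dictionary.index(needle)  # one lookup in the dictionary
    return sum(1 for i in indices if i == target)


# Hypothetical sample column with few distinct values.
column = ["GET", "POST", "GET", "GET", "PUT", "POST"]
dictionary, indices = dictionary_encode(column)

# Decoding restores the original column from the two parts.
decoded = [dictionary[i] for i in indices]
```

In this sketch each distinct string is stored once, so the fewer distinct values a column has, the smaller the dictionary is relative to the number of rows.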
The efficiency of the LowCardinality data type depends on data diversity. If a dictionary contains fewer than 10,000 distinct values, ClickHouse usually reads and stores data more efficiently. If a dictionary contains more than 100,000 distinct values, ClickHouse can perform worse than it does with ordinary data types.
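The thresholds above can be turned into a rough pre-check on a sample of the data. The Python sketch below is only an illustration: the function name `low_cardinality_hint` and its return labels are hypothetical, and it counts distinct values over the whole sample, which simplifies how dictionaries are actually built. Values between the two thresholds are not covered by the guidance, so the sketch suggests measuring.

```python
from typing import Hashable, Iterable

# Thresholds quoted in the text; everything else here is an assumption.
LIKELY_BENEFICIAL_BELOW = 10_000
POSSIBLY_HARMFUL_ABOVE = 100_000


def low_cardinality_hint(values: Iterable[Hashable]) -> str:
    """Return a rough hint based on the number of distinct values in a sample."""
    distinct = len(set(values))
    if distinct < LIKELY_BENEFICIAL_BELOW:
        return "likely beneficial"
    if distinct > POSSIBLY_HARMFUL_ABOVE:
        return "possibly harmful"
    return "measure first"


# Hypothetical samples: a handful of status codes vs. many unique IDs.
status_codes = [200, 404, 200, 500, 200] * 1_000
user_ids = range(150_000)

status_hint = low_cardinality_hint(status_codes)
user_id_hint = low_cardinality_hint(user_ids)
```

A hint like this is only a starting point; benchmarking on real data remains the reliable way to decide.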
Consider using LowCardinality instead of Enum when working with strings. LowCardinality is more flexible to use and often delivers the same or higher efficiency.
Example
Create a table with a LowCardinality column:
CREATE TABLE lc_t
(
    `id` UInt16,
    `strings` LowCardinality(String)
)
ENGINE = MergeTree()
ORDER BY id
Related Settings and Functions
Settings:
- low_cardinality_max_dictionary_size
- low_cardinality_use_single_dictionary_for_part
- low_cardinality_allow_in_native_format
- allow_suspicious_low_cardinality_types
- output_format_arrow_low_cardinality_as_dictionary
Functions:
- toLowCardinality
Related content
- Blog: Optimizing ClickHouse with Schemas and Codecs
- Blog: Working with time series data in ClickHouse
- String Optimization (video presentation in Russian). Slides in English