Efficient Point Lookups in ClickHouse with ULIDs and Min-Max Indexes

Introduction

When storing event or log data in ClickHouse, you often need to query specific event IDs. However, since the ORDER BY or PRIMARY KEY clauses in these tables usually prioritize timestamps over event IDs, searching for an event ID without a secondary index can lead to a full table scan—a costly operation.

In this post, we'll explore two efficient methods to perform point lookups in ClickHouse:

Using a Bloom filter index with regular non-sortable IDs.
Leveraging min-max indexes with ULIDs (Universally Unique Lexicographically Sortable Identifiers).

What are Bloom Filters?

Bloom filters are probabilistic data structures used to test whether an element is in a set. They are fast and space-efficient but can produce false positives.

How They Work

Initialization: Start with a bit array of size m set to 0 and use k independent hash functions.
Insertion: To add an element, hash it with each of the k hash functions and set the corresponding bits to 1.
Querying: To check for an element, hash it with the same k functions and verify that all corresponding bits are 1. If any bit is 0, the element is not in the set. If all are 1, the element is likely in the set.

You can control the false positive rate by adjusting the size of the bit array (m) and the number of hash functions (k).

Bloom Filters in ClickHouse

If your event IDs are unsortable unique identifiers and not in ORDER BY/PRIMARY KEY clause, a Bloom filter index is the only way to optimize point lookups in ClickHouse.

You can create a Bloom filter index like this:

-- Syntax
ALTER TABLE <table> ADD INDEX <index_name> <column> TYPE bloom_filter(<error_rate>) GRANULARITY <granularity>;

-- Example
ALTER TABLE events ADD INDEX bloom_filter_id_idx event_id TYPE bloom_filter(0.01) GRANULARITY 1;

The GRANULARITY parameter is crucial. To understand it, let's revisit ClickHouse's index_granularity.

ClickHouse's primary index is a sparse index, storing keys for every n-th row, where n is defined by index_granularity during table creation. A group of n rows is called a granule. By default, index_granularity is set to 8192.

Here's an example table:

CREATE TABLE events (
    event_id UUID,
    event_time DateTime,
    INDEX bloom_filter_id_idx event_id TYPE bloom_filter(0.01) GRANULARITY 1
) ENGINE = MergeTree()
ORDER BY event_time
SETTINGS index_granularity = 8192;

In this context, a GRANULARITY of 1 for the Bloom filter index means there's one Bloom filter per granule. If you set GRANULARITY to 2, there would be one Bloom filter for every two granules (i.e., every index_granularity * 2 rows).

What does this mean practically?

Lower GRANULARITY: More Bloom filters are created, which can increase storage and processing overhead but reduce the number of rows scanned during a lookup.
Higher GRANULARITY: Fewer Bloom filters are stored, reducing overhead but potentially increasing the number of rows scanned.

Adjust the GRANULARITY based on your needs, never assume anything, always test on your own dataset. Don't forget that, there is a cost to indexing during ingestion, so you should consider that as well.

Introducing ULIDs

ULIDs (Universally Unique Lexicographically Sortable Identifiers) are 128-bit values represented as 26-character strings. The first 10 characters encode a timestamp with millisecond precision, and the remaining 16 characters are random.

What makes ULIDs special is their lexicographical sortability based on time. This means if you sort ULIDs, they will be ordered chronologically. For event or log data, this is incredibly useful because:

Events are inherently time-based.
ULIDs combine the uniqueness of UUIDs with time-based ordering.
Storing ULIDs allows your table to be mostly sorted by event IDs, aligning with the timestamp order.

Leveraging ULIDs in ClickHouse enables us to use min-max indexes effectively for point lookups.

Min-Max Indexes in ClickHouse

A min-max index in ClickHouse stores the minimum and maximum values of a specified column for each granule or group of granules (again, GRANULARITY applies here as well). This index type is highly efficient for range queries and can be leveraged for point lookups when the data is sorted or partially sorted.

Since ULIDs are lexicographically sortable by time, using a min-max index on the event_id column (which contains ULIDs) can significantly speed up point lookups.

You can create a min-max index like this:

-- Syntax
ALTER TABLE <table> ADD INDEX <index_name> <column> TYPE minmax GRANULARITY <granularity>;

-- Example
ALTER TABLE events ADD INDEX minmax_id_idx event_id TYPE minmax GRANULARITY 1;

Comparing Bloom Filter and Min-Max Index Sizes

Now let's do a simple benchmark to compare the selectivity of the indexes and their on-disk sizes.

Setup

-- We'll create two tables, one with a bloom filter index and one with a minmax index
CREATE TABLE events_bloom (
    event_id String,
    event_time DateTime64,
    INDEX bloom_filter_id_idx event_id TYPE bloom_filter(0.01) GRANULARITY 1
) ENGINE = MergeTree()
ORDER BY event_time
SETTINGS index_granularity = 8192;

CREATE TABLE events_min_max (
    event_id String,
    event_time DateTime64,
    INDEX minmax_id_idx event_id TYPE minmax GRANULARITY 1
) ENGINE = MergeTree()
ORDER BY event_time
SETTINGS index_granularity = 8192;

-- Insert ulid/timestamp data into the tables both tables, about 1.05 million rows
INSERT INTO events_bloom SELECT `id`, `@timestamp` FROM ecs_local_ulid;
INSERT INTO events_min_max SELECT `id`, `@timestamp` FROM ecs_local_ulid;

-- Optimize the tables to ensure background merges don't interfere with the benchmark
OPTIMIZE TABLE events_bloom FINAL;
OPTIMIZE TABLE events_min_max FINAL;

Let's see how selective the indexes are with a random event ids

-- minmax index only needed to check 2 granules
EXPLAIN indexes = 1
SELECT *
FROM events_min_max
WHERE event_id IN ('01J9ZN765G81PYCA0DH6YS3G0Z', '01J8PP9KEGWVSKEY92YJYBFJ4T')

Query id: 0e3d74a7-0c7b-4663-93d2-70752766ab66

    ┌─explain────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))      │
 2. │   Expression                                   │
 3. │     ReadFromMergeTree (default.events_min_max) │
 4. │     Indexes:                                   │
 5. │       PrimaryKey                               │
 6. │         Condition: true                        │
 7. │         Parts: 1/1                             │
 8. │         Granules: 129/129                      │
 9. │       Skip                                     │
10. │         Name: minmax_id_idx                    │
11. │         Description: minmax GRANULARITY 1      │
12. │         Parts: 1/1                             │
13. │         Granules: 2/129                        │
    └────────────────────────────────────────────────┘

13 rows in set. Elapsed: 0.002 sec.


-- bloom filter index needed to check 3 granules!!
EXPLAIN indexes = 1
SELECT *
FROM events_bloom
WHERE event_id IN ('01J9ZN765G81PYCA0DH6YS3G0Z', '01J8PP9KEGWVSKEY92YJYBFJ4T')

Query id: ada3192c-2ef6-427b-8a5c-1a478a5d275e

    ┌─explain─────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))       │
 2. │   Expression                                    │
 3. │     ReadFromMergeTree (default.events_bloom)    │
 4. │     Indexes:                                    │
 5. │       PrimaryKey                                │
 6. │         Condition: true                         │
 7. │         Parts: 1/1                              │
 8. │         Granules: 129/129                       │
 9. │       Skip                                      │
10. │         Name: bloom_filter_id_idx               │
11. │         Description: bloom_filter GRANULARITY 1 │
12. │         Parts: 1/1                              │
13. │         Granules: 3/129                         │
    └─────────────────────────────────────────────────┘

13 rows in set. Elapsed: 0.004 sec.

Comparing the index sizes on disk

SELECT
    formatReadableSize(data_compressed_bytes),
    name,
    type_full,
    granularity
FROM system.data_skipping_indices
WHERE expr = 'event_id'

Query id: 45268f3d-ed52-4bc6-8497-4c47d02aa25e

   ┌─formatReadableSize(data_compressed_bytes)─┬─name────────────────┬─type_full──────────┬─granularity─┐
1. │ 1.26 MiB                                  │ bloom_filter_id_idx │ bloom_filter(0.01) │           1 │
2. │ 5.98 KiB                                  │ minmax_id_idx       │ minmax             │           1 │
   └───────────────────────────────────────────┴─────────────────────┴────────────────────┴─────────────┘

2 rows in set. Elapsed: 0.002 sec.

In this unscientific benchmark, the min-max index uses approximately 215 times less space than the Bloom filter index and is more selective. Which is a big deal, I/O is the most expensive operation in a database, so the less data you need to read/write the better.

Conclusion

So, yeah ULIDs or really any timesortable unique ids are useful for point-look up usescases in Clickhouse. Outside of this use-case, ULIDs are great for cursor pagination as well, but that's a topic for another day.