Sairamkrish.medium.com-Installing Clickhouse on a Kubernetes Cluster

This document provides a guide for installing Clickhouse, a fast open-source database, on a Kubernetes cluster, emphasizing the need for operators and Zookeeper for managing stateful instances. It outlines the installation steps for Zookeeper and Clickhouse operators, along with a demo manifest for setting up a Clickhouse cluster. Additionally, it discusses session affinity for ensuring data consistency during operations and includes useful resources for further learning.

Uploaded by

nnaemeken

Installing Clickhouse on a kubernetes cluster

sairamkrish.medium.com/installing-clickhouse-on-a-kubernetes-cluster-b2721c883e55

Sairam Krish 21 August 2023


Clickhouse is a fast, resource-efficient open-source database for real-time apps and
analytics. Installing Clickhouse on a Kubernetes cluster is not straightforward at the
moment. This article should help with that.

Kubernetes operators
A Clickhouse installation on Kubernetes may start simple and slowly get more and more
complex. It is stateful in nature, it needs to operate on a cluster of nodes, and it should
be resilient and fault tolerant from a Kubernetes resource perspective. So from the
beginning we will use operators to install and manage Clickhouse.

Clickhouse needs Zookeeper


ZooKeeper is a centralized service for maintaining configuration information, naming,
providing distributed synchronization, and providing group services. To run a Clickhouse
cluster, we need a coordination system between the Clickhouse instances in the cluster.

Zookeeper brings the guarantee of linearizable writes and non-linearizable reads to the
Clickhouse cluster.

Zookeeper installation steps


pravega/zookeeper-operator seems like a good option at the time of writing this article.
But I ran into an issue while installing this operator on a Mac M1: the image is only
available for `linux/amd64`, and they have not published a docker image for arm64.

We could simplify the Zookeeper installation with bitnami's zookeeper chart:

helm install zookeeper \
  --set zookeeper.enabled= \
  --set persistence.size=100Mi \
  --set replicaCount=1 \
  oci://registry-1.docker.io/bitnamicharts/zookeeper

Install clickhouse operator

helm repo add clickhouse-operator https://docs.altinity.com/clickhouse-operator/
helm repo update
helm install clickhouse-operator clickhouse-operator/altinity-clickhouse-operator

Install clickhouse
Here is a demo Clickhouse installation manifest that uses the Clickhouse operator. This
will spin up a cluster with 3 shards and 3 replicas.
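
As a rough sketch of what such a manifest can look like (not the author's original file), the snippet below writes a ClickHouseInstallation resource with 3 shards, each with 3 replicas, and applies it when kubectl is available. The resource name `demo`, the cluster name `demo`, the file name `clickhouse-demo.yaml`, and the Zookeeper host `zookeeper` on port 2181 (the Service a Bitnami release named `zookeeper` would typically expose in the same namespace) are all assumptions for illustration.

```shell
# Write a hypothetical ClickHouseInstallation manifest.
# Names, file name, and the Zookeeper host/port are assumptions; adjust to your cluster.
cat > clickhouse-demo.yaml <<'EOF'
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper
          port: 2181
    clusters:
      - name: "demo"
        layout:
          shardsCount: 3
          replicasCount: 3
EOF

# Apply the manifest only if kubectl is installed; the operator must already be running.
if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f clickhouse-demo.yaml || echo "kubectl apply failed; is the cluster reachable?"
else
  echo "kubectl not found; manifest written to clickhouse-demo.yaml"
fi
```

With this layout the operator would be asked for 9 Clickhouse pods in total, so a smaller `shardsCount` and `replicasCount` may be more comfortable on a local cluster.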

View table size information


To know how much space each table is taking, we can use the following query:

select parts.*,
       columns.compressed_size,
       columns.uncompressed_size,
       columns.compression_ratio,
       columns.compression_percentage
from (
         select table,
                formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size,
                formatReadableSize(sum(data_compressed_bytes)) AS compressed_size,
                round(sum(data_compressed_bytes) / sum(data_uncompressed_bytes), 3) AS compression_ratio,
                round((100 - (sum(data_compressed_bytes) * 100) / sum(data_uncompressed_bytes)), 3) AS compression_percentage
         from system.columns
         group by table
     ) columns
         right join (
    select table,
           sum(rows) as rows,
           max(modification_time) as latest_modification,
           formatReadableSize(sum(bytes)) as disk_size,
           formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size,
           any(engine) as engine,
           sum(bytes) as bytes_size
    from system.parts
    where active
    group by database, table
) parts on columns.table = parts.table
order by parts.bytes_size desc;

This is inspired by a public gist, where other developers have contributed further
improvements on top of it.
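
To make the two computed columns concrete, here is a small worked example of the same arithmetic, using made-up byte counts (250 compressed, 1000 uncompressed). Note that `compression_ratio` in the query is compressed size divided by uncompressed size, so a smaller value means better compression.

```shell
# Made-up sizes in bytes, purely for illustration.
compressed=250
uncompressed=1000

# Same formulas as the compression_ratio and compression_percentage columns.
result=$(awk -v c="$compressed" -v u="$uncompressed" 'BEGIN {
  printf "compression_ratio=%.3f\n", c / u
  printf "compression_percentage=%.3f\n", 100 - (c * 100) / u
}')

echo "$result"
# prints:
# compression_ratio=0.250
# compression_percentage=75.000
```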

Schema migration tool


Among the different schema migration tools, I like dbmate for Clickhouse. I will update
this section once I have a demo example.
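
In the meantime, a minimal sketch of how a dbmate migration is usually laid out. dbmate reads SQL files from `./db/migrations` and splits each file on the `-- migrate:up` and `-- migrate:down` markers. The `events` table, the migration file name, and the `DATABASE_URL` (which assumes Clickhouse is reachable on localhost:9000, for example through a port-forward) are hypothetical.

```shell
# Create a hypothetical migration file in dbmate's default migrations directory.
mkdir -p db/migrations
cat > db/migrations/20230821000000_create_events.sql <<'EOF'
-- migrate:up
CREATE TABLE events (
    event_time DateTime,
    user_id UInt64,
    action String
) ENGINE = MergeTree
ORDER BY (event_time, user_id);

-- migrate:down
DROP TABLE events;
EOF

# Connection string is an assumption: default user, local port-forward, default database.
export DATABASE_URL="clickhouse://default@localhost:9000/default"

# Run pending migrations only if dbmate is installed.
if command -v dbmate >/dev/null 2>&1; then
  dbmate up || echo "dbmate up failed; is Clickhouse reachable at DATABASE_URL?"
else
  echo "dbmate not found; migration written to db/migrations/"
fi
```

On a sharded, replicated cluster the table definition would likely need a replicated engine and cluster-wide DDL, which this sketch leaves out.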

Kubernetes session affinity


Clickhouse is an eventually consistent database. There are times when we want to execute
a data insert, update, or delete and immediately look up the updated data. This is natural
in a transactional database like postgresql, but not a good use case for an OLAP database
like Clickhouse. If a few use cases do need to read their own writes consistently, we
could use session affinity and route traffic from the same client IP to the same Clickhouse pod.

Not just the Clickhouse service, but every service involved in the end-user
interaction should have sessionAffinity: ClientIP.
For example, if we have a service behind kong, both that service and the kong service
should be configured to use sessionAffinity. If kong is behind an istio-ingress, we should
configure sessionAffinity on istio as well.
Without configuring all the services, Clickhouse sees only the IP of the proxy in front of
it, so we may end up sending all requests to the same Clickhouse pod.
With end-to-end sessionAffinity, each end user stays on one Clickhouse pod, and different
users can land on different pods, until
service.spec.sessionAffinityConfig.clientIP.timeoutSeconds expires. The default is 3
hours (10800 seconds). We can change it based on our needs.
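
As a sketch of the setting described above, the snippet below builds a merge patch that turns on ClientIP session affinity with the default timeout and applies it to a Service when kubectl is available. The Service name `clickhouse-demo` is an assumption; substitute the name of the Service in front of your Clickhouse pods, and repeat for each Service in the request path.

```shell
# Hypothetical Service name; replace with the Service that fronts your Clickhouse pods.
SERVICE_NAME="clickhouse-demo"

# Merge patch enabling ClientIP affinity with a 10800 second (3 hour) timeout.
PATCH='{"spec":{"sessionAffinity":"ClientIP","sessionAffinityConfig":{"clientIP":{"timeoutSeconds":10800}}}}'
echo "$PATCH"

# Apply the patch only if kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl patch service "$SERVICE_NAME" --type merge -p "$PATCH" \
    || echo "kubectl patch failed; does the Service exist in the current namespace?"
else
  echo "kubectl not found; patch not applied"
fi
```

A manual patch like this is mainly useful for trying the behaviour out. For Services managed by an operator, the change may be reverted on the next reconcile, so a lasting setup would probably declare the affinity in the manifest that defines the Service.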

Useful videos

Useful links
from altinity

Good reads
From EBay —
