Installing Clickhouse on a Kubernetes Cluster
Sairam Krish
ClickHouse positions itself as the fastest and most resource-efficient open-source database for real-time apps and analytics. Installing ClickHouse on a Kubernetes cluster is not straightforward at the moment. This article walks through how to do it.
Kubernetes operators
A ClickHouse installation on Kubernetes may start simple, but it slowly gets more and more complex. ClickHouse is stateful in nature, needs to operate as a cluster of nodes, and should be resilient and fault tolerant from a Kubernetes resource perspective. So, from the beginning, we will use an operator to install and manage ClickHouse.
Install clickhouse
Here is a demo ClickHouse installation manifest that uses the clickhouse-operator. It will spin up a cluster with 3 shards of 3 replicas each.
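A minimal sketch of such a manifest, assuming the Altinity clickhouse-operator and its CRDs are already installed in the cluster (the names `demo` and `demo-cluster` are placeholders, not from the article):

```yaml
# Custom resource handled by the Altinity clickhouse-operator.
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo"            # placeholder installation name
spec:
  configuration:
    clusters:
      - name: "demo-cluster"   # placeholder cluster name
        layout:
          shardsCount: 3       # 3 shards
          replicasCount: 3     # 3 replicas per shard
```

Applying this with `kubectl apply -f` lets the operator create the StatefulSets, pods, and services; note that a real replicated setup would also need a ZooKeeper (or ClickHouse Keeper) configuration in the spec.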
The following query reports per-table disk usage, compressed vs. uncompressed size, and compression ratio, by joining system.columns with system.parts:

select parts.*,
       columns.compressed_size,
       columns.uncompressed_size,
       columns.compression_ratio,
       columns.compression_percentage
from (
         select table,
                formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size,
                formatReadableSize(sum(data_compressed_bytes)) AS compressed_size,
                round(sum(data_compressed_bytes) / sum(data_uncompressed_bytes), 3) AS compression_ratio,
                round((100 - (sum(data_compressed_bytes) * 100) / sum(data_uncompressed_bytes)), 3) AS compression_percentage
         from system.columns
         group by table
     ) columns
     right join (
         select table,
                sum(rows) as rows,
                max(modification_time) as latest_modification,
                formatReadableSize(sum(bytes)) as disk_size,
                formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size,
                any(engine) as engine,
                sum(bytes) as bytes_size
         from system.parts
         where active
         group by database, table
     ) parts on columns.table = parts.table
order by parts.bytes_size desc;
This query is inspired by this public gist, where other developers have contributed further improvements on top of it.
Session affinity
Not just the ClickHouse service, but all the services involved in the end-user interaction should have sessionAffinity: ClientIP.
Let's say we have a service behind Kong: both that service and the Kong service should be configured to use sessionAffinity. If Kong is behind an istio-ingress, we should configure sessionAffinity on the Istio side as well.
Without configuring all the services in the chain, we may end up sending all requests to the same ClickHouse pod, because the intermediate proxy's IP becomes the effective client IP.
With end-to-end sessionAffinity, each end user sticks to their own ClickHouse pod until service.spec.sessionAffinityConfig.clientIP.timeoutSeconds expires. The default is 3 hours (10800 seconds); we can change it based on our needs.
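As a sketch, a Service with client-IP affinity might look like this (`my-service`, `my-app`, and the ports are placeholders, not from the article):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service          # placeholder service name
spec:
  selector:
    app: my-app             # placeholder pod label
  ports:
    - port: 8123            # ClickHouse HTTP interface, as an example
      targetPort: 8123
  sessionAffinity: ClientIP # route a given client IP to the same pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800 # default: 3 hours
```

The same sessionAffinity stanza would be repeated on every Service in the chain — the application service, Kong's proxy service, and the Istio ingress service — so the affinity holds end to end.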
Useful videos
Useful links
from Altinity
Good reads
From eBay —