Centralizing Kubernetes and Container Operations
Oleg Chunikhin | CTO, Kublr
Introductions
Oleg Chunikhin
CTO, Kublr
• Twitter @olgch
History
• Custom software development company
• Dozens of projects per year
• Varying target environments: clouds, on-prem, hybrid
• Recurring need for a unified application delivery and ops platform with monitoring, logs, security, multiple environments, ...
@olgch; @kublr
Docker and Kubernetes to the Rescue
• Docker is great, but local
• Kubernetes is great... when it is up and running
• Who sets up and operates K8S clusters?
• Who takes care of operational aspects at scale?
• How do you provide governance and ensure
compliance?
Enterprise Kubernetes Needs
Developers
• Self-service
• Compatible
• Conformant
• Configurable
• Open & Flexible
SRE/Ops/DevOps/SecOps
• Org multi-tenancy
• Single pane of glass
• Operations
• Monitoring
• Log collection
• Image management
• Security
• Identity management
• Reliability
• Performance
• Portability
Kubernetes Management Platform Wanted
• Portability – clouds, on-prem, hybrid, air-gapped, different OSes
• Centralized multi-cluster operations save resources – many environments (dev, prod, QA, ...), teams, applications
• Self-service and governance for Kubernetes operations
• Reliability – cluster self-healing, self-reliance
• Limited management profile – cloud and K8S API
• Architecture – flexible, open, pluggable, compatible
• Sturdy – secure, scalable, modular, HA, DR etc.
[Diagram labels: Operations; Security & Governance; Usage; Infrastructure; API; Reporting]
Central Control Plane: Operations
[Diagram labels: K8S Clusters; Cloud; DevAPI]
Cluster: Self-Sufficiency
[Diagram labels: Simple orchestration and configuration agent; Infrastructure Automation; Orchestration Store (secrets, discovery); Central control plane; MASTER and NODE, each with Docker and an overlay network (discovery, connectivity)]
Cluster: Portability
• (Almost) everything runs in containers
• Simple (single-binary) management agent
• Minimal store requirements
• Enable access to the store
Ansible, ...
• Load balancer is not required for multi-master; each agent can independently fail over to a healthy master
[Diagram labels: Infrastructure Automation; Orchestration Store (secrets, discovery); K8s Master Components: etcd, scheduler, API; Infrastructure and application containers]
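The failover point above can be illustrated with a minimal Python sketch. It is not Kublr's agent code: the function name `pick_healthy_master`, the injected `is_healthy` probe, and the endpoint addresses are all hypothetical. The idea is only that each agent holds the full list of masters and picks a healthy one itself, so no load balancer has to sit in front of them.

```python
from typing import Callable, Iterable, Optional


def pick_healthy_master(
    masters: Iterable[str],
    is_healthy: Callable[[str], bool],
    preferred: Optional[str] = None,
) -> Optional[str]:
    """Return the first master endpoint that passes the health probe.

    The endpoint currently in use (if any) is tried first, so an agent
    only switches masters when its current one stops responding.
    """
    candidates = list(masters)
    if preferred in candidates:
        candidates.remove(preferred)
        candidates.insert(0, preferred)
    for endpoint in candidates:
        try:
            if is_healthy(endpoint):
                return endpoint
        except OSError:
            # A connection error counts as "unhealthy"; try the next master.
            continue
    return None


# Hypothetical endpoints; the probe is stubbed to simulate one failed master.
masters = ["https://10.0.0.1:6443", "https://10.0.0.2:6443", "https://10.0.0.3:6443"]
down = {"https://10.0.0.1:6443"}
current = pick_healthy_master(masters, lambda m: m not in down, preferred=masters[0])
print(current)
```

In a real agent the probe would be a network health check against the master's API, and the selection would be repeated whenever the current master stops answering.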
Cluster: Reliability
• Rely on underlying platform as much as possible
• ASG on AWS
[Diagram labels: Infrastructure Automation; Orchestration Store]
Central Control Plane: Logs and Metrics
[Diagram labels: K8S Clusters; Cloud; DevAPI]
Centralized Monitoring and Log Collection. Why Bother?
• Prometheus and ELK are heavy and not easy to operate; each needs attention and at least 4-8 GB of RAM per cluster
• Cloud/SaaS monitoring is not always permitted or available
• Existing monitoring is often not container-aware
• No aggregated view and analysis
• No alerting governance
K8S Monitoring with Prometheus
Centralized Monitoring
[Diagram labels: Control plane; Cluster registry; Configurator; Prometheus config; Grafana; Prometheus; Prometheus data; K8S Proxy API; Kubernetes cluster; Prometheus (collector); nodes, pods, service endpoints; Ship externally]
Centralized Monitoring: Considerations
• Prometheus resource usage tuning
• Long-term storage (m3)
• Configuration file growth with many clusters
• Metrics labeling
• Additional load on API server
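The configuration growth and labeling considerations can be made concrete with a small Python sketch of a configurator step. It assumes, as the diagram labels suggest but the slides do not spell out, that a central Prometheus pulls from each cluster's collector through the Kubernetes API service proxy using Prometheus federation. The registry fields (`name`, `api_host`, `namespace`, `service`) and the function names are hypothetical, and authentication and TLS settings are left out.

```python
import json
from typing import Dict, List


def federation_job(cluster: Dict[str, str]) -> dict:
    """Build one scrape job that federates a cluster's in-cluster Prometheus."""
    return {
        "job_name": f"federate-{cluster['name']}",
        "scheme": "https",
        # Keep the labels assigned inside the cluster instead of overwriting them.
        "honor_labels": True,
        # Kubernetes API service proxy path that ends at the /federate endpoint.
        "metrics_path": (
            f"/api/v1/namespaces/{cluster['namespace']}"
            f"/services/{cluster['service']}/proxy/federate"
        ),
        # Federate every series that has a job label (an assumed, broad selector).
        "params": {"match[]": ['{job=~".+"}']},
        "static_configs": [
            {
                "targets": [cluster["api_host"]],
                # The cluster label lets the central view tell clusters apart.
                "labels": {"cluster": cluster["name"]},
            }
        ],
    }


def build_config(registry: List[Dict[str, str]]) -> dict:
    """Generate the scrape section: one job per registered cluster."""
    ordered = sorted(registry, key=lambda c: c["name"])
    return {"scrape_configs": [federation_job(c) for c in ordered]}


# Hypothetical cluster registry entries.
registry = [
    {"name": "prod", "api_host": "prod.example.com:443",
     "namespace": "monitoring", "service": "prometheus:9090"},
    {"name": "dev", "api_host": "dev.example.com:443",
     "namespace": "monitoring", "service": "prometheus:9090"},
]
print(json.dumps(build_config(registry), indent=2))
```

Each registered cluster adds one job, which is why the configuration file grows with the number of clusters, and every scrape goes through that cluster's API server, which is the additional load the slide mentions. The output is printed as JSON only for inspection; a real configurator would write it in the YAML form Prometheus reads.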
K8S Logging with Elasticsearch
• Fluentd runs on nodes
• OS, K8S, and container logs are collected and shipped to Elasticsearch
[Diagram labels: Kubernetes Cluster; Elasticsearch; Kibana]
Centralized Log Collection
[Diagram labels: Control plane; Cluster registry; Configurator; Messaging config; K8S Proxy API; Port forwarding; Kubernetes cluster; Fluentd; filter; MQTT Forwarder; RabbitMQ; MQTT; RabbitMQ Shovel; Prometheus (collector); filter; analyze; Logstash; Elasticsearch; Ship externally]
Centralized Log Collection: Considerations
• Tune Elasticsearch resource usage
• Take into account additional load on API server
• Log index structure normalization
    {
      "data": {
        "elasticsearch": {
          "version": "6.x"
        }
      }
    }

    {
      "flatData": [
        {
          "key": "elasticsearch.version",
          "type": "string",
          "key_type": "elasticsearch.version.string",
          "value_string": "6.x"
        },
        ...
      ]
    }
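The normalization shown on this slide can be sketched in Python as a recursive flattening step. This is an illustration, not Kublr's filter: the function names are hypothetical, and only the `string` case appears on the slide, so the `number` and `boolean` type names are assumptions. A likely motivation is that a fixed set of fields (`key`, `type`, `key_type`, `value_*`) keeps the Elasticsearch index mapping small and avoids type conflicts when different applications log the same field name with different types.

```python
from typing import Any, Dict, List


def _type_name(value: Any) -> str:
    """Map a Python value to a type tag; only "string" is shown on the slide."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def flatten(data: Dict[str, Any], prefix: str = "") -> List[Dict[str, Any]]:
    """Turn nested objects into a flat list of typed key/value entries."""
    entries: List[Dict[str, Any]] = []
    for name, value in data.items():
        key = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            entries.extend(flatten(value, key))
            continue
        type_name = _type_name(value)
        entries.append({
            "key": key,
            "type": type_name,
            "key_type": f"{key}.{type_name}",
            # Anything that is not a number or boolean is stored as text.
            f"value_{type_name}": str(value) if type_name == "string" else value,
        })
    return entries


def normalize(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the free-form "data" object with a "flatData" list."""
    return {"flatData": flatten(doc.get("data", {}))}


record = {"data": {"elasticsearch": {"version": "6.x"}}}
print(normalize(record))
```

For the record on the slide this produces a single entry with `key` set to `elasticsearch.version`, `key_type` set to `elasticsearch.version.string`, and `value_string` set to `6.x`.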
Q&A
Take Kublr for a test drive!
kublr.com/deploy
Free non-production license.
Oleg Chunikhin
Chief Technology Officer
[email protected]
@olgch