Installing the command-line executable

Assuming you have Rust/Cargo installed, run this command in a terminal:

cargo install k8sql

It will make the k8sql command available in your PATH if you've allowed the PATH to be modified when installing Rust. cargo uninstall k8sql uninstalls.

Back to the crate overview.

Readme

k8sql

Query Kubernetes clusters using SQL or PRQL. Powered by Apache DataFusion.

Features

Dual query languages — Write SQL or PRQL, auto-detected
Intuitive JSON paths — Use status.phase instead of status->>'phase'
Multi-cluster queries — Query across clusters with _cluster = '*'
Smart optimization — Filters pushed to K8s API (namespaces, labels, field selectors)
Multiple interfaces — Interactive REPL, batch mode, PostgreSQL wire protocol
Full CRD support — Automatically discovers Custom Resource Definitions

Installation

Quick Install (Linux/macOS)

curl -sSfL https://raw.githubusercontent.com/ndenev/k8sql/master/install.sh | sh

The script automatically detects your OS and architecture, downloads the latest release, and installs to /usr/local/bin or ~/.local/bin.

From Cargo

cargo install k8sql

Manual Download

Download pre-built binaries for Linux, macOS, and Windows from GitHub Releases.

Quick Start

# Interactive REPL
k8sql

# SQL query
k8sql -q "SELECT name, namespace, status.phase FROM pods"

# PRQL query (auto-detected)
k8sql -q "from pods | filter status.phase == 'Running' | select {name, namespace}"

# Multi-cluster query
k8sql -c "prod-*" -q "SELECT _cluster, name FROM pods WHERE status.phase = 'Failed'"

# Output as JSON (or YAML or CSV)
k8sql -q "SELECT * FROM deployments" -o json

SQL vs PRQL

k8sql supports both SQL and PRQL — just write your query and it's auto-detected:

Task	SQL	PRQL
Basic query	`SELECT name FROM pods LIMIT 5`	`from pods \| select {name} \| take 5`
Filter	`WHERE namespace = 'default'`	`filter namespace == "default"`
Sort descending	`ORDER BY created DESC`	`sort {-created}`
Aggregate	`COUNT(*) ... GROUP BY namespace`	`group namespace (aggregate {count this})`
JSON field	`status.phase` or `status->>'phase'`	`status.phase`
Regex match	`WHERE name ~ 'nginx.*'`	`filter name ~= "nginx.*"`
Null coalesce	`COALESCE(namespace, 'default')`	`namespace ?? "default"`

CLI Options

-q, --query <SQL>       Execute a SQL query directly
-c, --context <CTX>     Kubernetes context(s): name, comma-separated, or glob pattern
-o, --output <FMT>      Output format: table, json, csv, yaml (default: table)
-f, --file <PATH>       Execute queries from a file
    --no-headers        Omit column headers in output
    --refresh-crds      Force refresh of cached CRD schemas
-v, --verbose           Enable verbose logging

Subcommands

k8sql interactive       # Start REPL (default)
k8sql daemon -p 15432   # PostgreSQL wire protocol server

SQL Extensions

The `_cluster` Column

Every table includes _cluster as the first column, enabling cross-cluster queries:

SELECT * FROM pods WHERE _cluster = 'prod'           -- Specific cluster
SELECT * FROM pods WHERE _cluster IN ('prod', 'dev') -- Multiple clusters
SELECT * FROM pods WHERE _cluster = '*'              -- All clusters (parallel)

Intuitive JSON Access

Intuitive JSON path access vs the more verbose -> operators.

-- Intuitive syntax (auto-converted)
SELECT name, status.phase FROM pods
SELECT name, spec.containers[0].image FROM pods
SELECT * FROM pods WHERE labels.app = 'nginx'

-- Array expansion (one row per container)
SELECT name, spec.containers[].image FROM pods

This works in both SQL and PRQL - no special syntax needed.

Supported patterns:

Syntax	Meaning	Converted To
`status.phase`	Field access	`status->>'phase'`
`spec.selector.app`	Nested fields	`spec->'selector'->>'app'`
`spec.containers[0]`	Array index	`spec->'containers'->0`
`spec.containers[]`	Array expansion	`UNNEST(json_get_array(...))`

Advanced: PostgreSQL arrow operators (also supported):

SELECT name, status->>'phase' FROM pods
SELECT name, status->'nodeInfo'->>'kubeletVersion' FROM nodes

JSON functions for complex access:

SELECT json_get_str(spec, 'containers', 0, 'image') FROM pods
SELECT json_get_int(spec, 'replicas') FROM deployments

Available JSON functions: json_get_str, json_get_int, json_get_float, json_get_bool, json_get_json, json_get_array, json_length, json_keys.

Why PRQL?

PRQL (Pipelined Relational Query Language) offers a more readable alternative to SQL:

Readable pipelines — Data flows left-to-right: from pods | filter ... | select ...
Less repetition — No need to repeat columns in GROUP BY
Composable — Build complex queries incrementally
Same JSON paths — Use status.phase just like in SQL

PRQL-specific operators:

Operator	Meaning	Example
`~=`	Regex match	`filter name ~= "nginx.*"`
`??`	Null coalesce	`derive ns = namespace ?? "default"`
`f"..."`	String interpolation	`derive full = f"{namespace}/{name}"`
`-col`	Sort descending	`sort {-created}`

PRQL is auto-detected when queries start with from, let, or prql.

Query Optimization

k8sql pushes predicates to the Kubernetes API when possible:

Predicate	Optimization
`namespace = 'x'`	Namespaced API call
`labels->>'app' = 'nginx'`	Label selector (server-side)
`_cluster = 'prod'`	Only queries that cluster
`_cluster = '*'`	Parallel queries to all clusters

Use -v to see which filters are pushed down.

Table Schema

All resources share a consistent schema:

Column	Type	Description
`_cluster`	text	Kubernetes context name
`api_version`	text	API version (v1, apps/v1, etc.)
`kind`	text	Resource kind
`name`	text	Resource name
`namespace`	text	Namespace (null for cluster-scoped)
`uid`	text	Unique identifier
`created`	timestamp	Creation timestamp
`labels`	json	Labels (access with `labels->>'key'`)
`annotations`	json	Annotations
`spec`	json	Resource specification
`status`	json	Resource status

Special cases: ConfigMaps/Secrets have data instead of spec/status. Events have dedicated columns, etc. CRDs have their top level fields discovered.

REPL Commands

SHOW TABLES          List available tables
SHOW DATABASES       List kubectl contexts (* = active)
DESCRIBE <table>     Show table schema
USE <cluster>        Switch context(s) - supports globs: USE prod-*
\dt, \l, \d, \x, \q  Shortcuts

Context selection persists to ~/.k8sql/config.json and restores on next startup.

Daemon Mode

Run as a PostgreSQL-compatible server:

k8sql daemon --port 15432 --context "prod-*"
psql -h localhost -p 15432

Caching

k8sql caches CRD schemas for fast startup:

CRD schemas: Cached indefinitely (use --refresh-crds to force refresh)
Cluster CRD list: Checked hourly for new/removed CRDs
Cache location: ~/.k8sql/cache/

Example Queries

SQL

-- Pods with status (using intuitive dot notation)
SELECT name, namespace, status.phase FROM pods

-- Container images from pods
SELECT name, spec.containers[0].image FROM pods WHERE namespace = 'default'

-- Unhealthy deployments
SELECT name, spec.replicas as desired, status.readyReplicas as ready
FROM deployments
WHERE status.readyReplicas < spec.replicas

-- Cross-cluster pod count
SELECT _cluster, COUNT(*) FROM pods WHERE _cluster = '*' GROUP BY _cluster

-- Find pods by label (server-side filtering)
SELECT name FROM pods WHERE labels.app = 'nginx' AND namespace = 'default'

-- All container images (array expansion)
SELECT name, spec.containers[].image as container_image FROM pods

PRQL

# Pods with status
from pods | select {name, namespace, phase = status.phase}

# Filter by phase
from pods
filter status.phase == "Running"
select {name, namespace}
take 10

# Unhealthy deployments
from deployments
filter status.readyReplicas < spec.replicas
select {name, desired = spec.replicas, ready = status.readyReplicas}

# Pods by label
from pods
filter labels.app == "nginx" && namespace == "default"
select {name, namespace}

# Aggregation - pod count by namespace
from pods
group namespace (aggregate {count = count this})
sort {-count}

Built With

Apache DataFusion - SQL query engine
kube-rs - Kubernetes client for Rust

License

BSD-3-Clause