Using Steampipe to SQL Query Kubernetes
2024-09-03
version 0.1, 2024-09-03
Steampipe is a tool that allows you to use SQL to query multiple API endpoints.
Over time I have built several ad-hoc and not-so-ad-hoc tools that do something similar, but unlike Steampipe they had a very limited scope and required me to rebuild the DB (SQLite) every time I wanted to re-hydrate the data.
This post covers the following:
- A trip down some of my ad-hoc solutions.
- Installing Steampipe and the Kubernetes plugin.
- Some useful functions.
- Some useful queries.
Ad-hoc Solution Basics
Let's say I need to get a list of pods and see which nodes they are running on. With kubectl there is no direct way to do this; you need to run two queries and then join the results.
I have seen people do it with jq or some other tool, but I normally default to SQLite (that is, until I found Steampipe).
So I would do something like this:
Define which fields I want from the pods and the nodes, and surface them as top-level TSV columns:
$ kubectl get pods -A -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.namespace}{"\t"}{.spec.nodeName}{"\n"}{end}' > pods.tsv
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kops\.k8s\.io/instancegroup}{"\t"}{.spec.taints}{"\t"}{.metadata.labels.beta\.kubernetes\.io/instance-type}{"\t"}{.metadata.labels.topology\.kubernetes\.io/zone}{"\t"}{.metadata.creationTimestamp}{"\n"}{end}' > nodes.tsv
Then create a simple schema based on those fields:
CREATE TABLE IF NOT EXISTS "pods" (
"Name" TEXT,
"Namespace" TEXT,
"NodeName" TEXT
);
CREATE TABLE IF NOT EXISTS "nodes" (
"Name" TEXT,
"InstanceGroup" TEXT,
"Taints" TEXT,
"InstanceType" TEXT,
"Zone" TEXT,
"CreationTimestamp" TEXT
);
.mode tabs
.import pods.tsv pods
.import nodes.tsv nodes
.mode table
As you can see, the schema file also imports the data into the tables.
Create the DB:
$ sqlite3 k8s.db < db.schema
Then run your query:
$ sqlite3 k8s.db
sqlite> SELECT pods.*,nodes.instancetype FROM pods LEFT JOIN nodes ON pods.nodename = nodes.name WHERE namespace = 'default';
+-------------+-----------+---------------------------------------------+--------------+
| Name        | Namespace | NodeName                                    | InstanceType |
+-------------+-----------+---------------------------------------------+--------------+
| app-a-kpns4 | default   | ip-xxx-yy-42-42.us-west-2.compute.internal  | m5.8xlarge   |
| app-a-bz7rv | default   | ip-xxx-yy-39-148.us-west-2.compute.internal | m5.8xlarge   |
| app-a-jr7rd | default   | ip-xxx-yy-45-31.us-west-2.compute.internal  | m5.8xlarge   |
+-------------+-----------+---------------------------------------------+--------------+
That is a very simple example, but you can already see the effort required to prepare and maintain the data. Every time you need another field or another label, you have to update the schema and re-import the data.
Steampipe Installation
Installation is described here: https://steampipe.io/downloads
And the Kubernetes plugin installation is described here: https://hub.steampipe.io/plugins/turbot/kubernetes
Finally, set up your config file ~/.steampipe/config/kubernetes.spc:
# Default connection that points to my current context
connection "k" {
plugin = "kubernetes"
custom_resource_tables = ["*"]
source_types = ["deployed"]
}
# Specific connections to a single context
connection "k_dev" {
plugin = "kubernetes"
config_path = "~/.kube/config"
config_context = "dev.example.com"
custom_resource_tables = ["*"]
source_types = ["deployed"]
}
connection "k_prod" {
plugin = "kubernetes"
config_path = "~/.kube/config"
config_context = "prod.example.com"
custom_resource_tables = ["*"]
source_types = ["deployed"]
}
Important: The default connection (k above) points to whatever your current kubectl context is. If you switch contexts while the Steampipe service is running, you need to restart the service for the change to take effect.
After setting up your config you can run Steampipe from the CLI or start the service and connect to it via a PostgreSQL client.
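If you go the service route, it looks roughly like this (steampipe service start prints the full connection string, including the generated password; port 9193 and the steampipe database/user are the defaults at the time of writing):
$ steampipe service start
$ psql -h 127.0.0.1 -p 9193 -d steampipe -U steampipe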
Now you can run the same query as before, without any prep:
Run: steampipe query
Then enter your query:
SELECT
pod.name,
pod.namespace,
node.name AS nodename,
node.labels ->> 'beta.kubernetes.io/instance-type' AS instancetype
FROM
k.kubernetes_pod AS pod
LEFT JOIN k.kubernetes_node AS node ON pod.node_name = node.name
WHERE
pod.namespace = 'default';
And just like that you get the same result as before. The only difference is that instead of surfacing the labels as top-level columns, I access them as a JSON object.
If you need the cache to refresh, run:
SELECT FROM steampipe_internal.meta_cache('clear');
And as mentioned before, if you are changing contexts and are using the default connection, you need to restart the service.
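Restarting the service is just:
$ steampipe service restart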
CPU and Memory Functions
Currently (2024-09-03) Steampipe does not have built-in functions to normalize CPU and Memory values so that one can do math on them. I created a GitHub issue to try and figure out how to add them.
I would like, for example, to see the total CPU and Memory used by a deployment, or the percentage of utilization a node has.
The CPU and Memory requests and limits are defined as a Quantity and can use the following suffixes:
- base 10 units: m | "" | k | M | G | T | P | E
- base 2 units: Ki | Mi | Gi | Ti | Pi | Ei
Where m fractions get rounded up.
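To get a feel for the raw values, here is a quick sketch that pulls the request strings straight from the pod specs over the k connection (containers is the pod's container list as JSON):
SELECT
  name,
  namespace,
  c -> 'resources' -> 'requests' ->> 'cpu' AS cpu_request,
  c -> 'resources' -> 'requests' ->> 'memory' AS memory_request
FROM
  k.kubernetes_pod,
  jsonb_array_elements(containers) AS c
LIMIT 5;
The values come back exactly as written in the manifests ('100m', '128Mi', and so on), which is what the functions below normalize.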
CPU units are easier to normalize because, at least in the clusters I work with, we only see m (millicores) and "" (cores), so I chose to normalize to m.
CREATE OR REPLACE FUNCTION cpu_m (cpu text)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
v bigint;
BEGIN
IF cpu like '%m' THEN
-- https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/
-- m fractions get rounded up
SELECT
ceiling(trim(trailing 'm' from cpu)::numeric) INTO v;
ELSE
SELECT
cpu::numeric * 1000 INTO v;
END IF;
RETURN v;
END;
$$;
And to validate it works, run:
SELECT
cpu_m ('31750m') = 31750,
cpu_m ('32') = 32000,
cpu_m ('1.5') = 1500,
cpu_m ('1.5m') = 2;
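With cpu_m in place, values such as a node's allocatable CPU become numbers you can sort and sum; a quick sketch against the k connection:
SELECT
  name,
  allocatable ->> 'cpu' AS raw_cpu,
  cpu_m (allocatable ->> 'cpu') AS cpu_milli
FROM
  k.kubernetes_node
ORDER BY
  cpu_milli DESC
LIMIT 5;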
For memory, the units are a bit more complex, but I chose to normalize to bytes since requesting a fraction of a byte makes no sense.
However, after testing it in my clusters I did find instances where developers had requested memory in m units. This is probably a mistake or just bad practice IMHO, so in the future I might add a Kyverno policy to prevent this.
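A query along these lines is how you can spot them (a sketch; it just looks for request strings ending in a lowercase m):
SELECT
  namespace,
  name,
  c ->> 'name' AS container,
  c -> 'resources' -> 'requests' ->> 'memory' AS memory_request
FROM
  k.kubernetes_pod,
  jsonb_array_elements(containers) AS c
WHERE
  c -> 'resources' -> 'requests' ->> 'memory' LIKE '%m';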
CREATE OR REPLACE FUNCTION memory_bytes (memory text)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
v bigint;
BEGIN
-- https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/
-- base 10: m | "" | k | M | G | T | P | E
-- base 2: Ki | Mi | Gi | Ti | Pi | Ei
-- m fractions get rounded up
CASE --
WHEN memory LIKE '%m' THEN
SELECT
ceiling(trim(TRAILING 'm' FROM memory)::numeric / 1000) INTO v;
WHEN memory LIKE '%k' THEN
SELECT
trim(TRAILING 'k' FROM memory)::numeric * 1000 INTO v;
WHEN memory LIKE '%M' THEN
SELECT
trim(TRAILING 'M' FROM memory)::numeric * 1000_000 INTO v;
WHEN memory LIKE '%G' THEN
SELECT
trim(TRAILING 'G' FROM memory)::numeric * 1000_000_000 INTO v;
WHEN memory LIKE '%T' THEN
SELECT
trim(TRAILING 'T' FROM memory)::numeric * 1000_000_000_000 INTO v;
WHEN memory LIKE '%P' THEN
SELECT
trim(TRAILING 'P' FROM memory)::numeric * 1000_000_000_000_000 INTO v;
WHEN memory LIKE '%E' THEN
SELECT
trim(TRAILING 'E' FROM memory)::numeric * 1000_000_000_000_000_000 INTO v;
WHEN memory LIKE '%Ki' THEN
SELECT
trim(TRAILING 'Ki' FROM memory)::numeric * 1024 INTO v;
WHEN memory LIKE '%Mi' THEN
SELECT
trim(TRAILING 'Mi' FROM memory)::numeric * 1024 * 1024 INTO v;
WHEN memory LIKE '%Gi' THEN
SELECT
trim(TRAILING 'Gi' FROM memory)::numeric * 1024 * 1024 * 1024 INTO v;
WHEN memory LIKE '%Ti' THEN
SELECT
trim(TRAILING 'Ti' FROM memory)::numeric * 1024 * 1024 * 1024 * 1024 INTO v;
WHEN memory LIKE '%Pi' THEN
SELECT
trim(TRAILING 'Pi' FROM memory)::numeric * 1024 * 1024 * 1024 * 1024 * 1024 INTO v;
WHEN memory LIKE '%Ei' THEN
SELECT
trim(TRAILING 'Ei' FROM memory)::numeric * 1024 * 1024 * 1024 * 1024 * 1024 * 1024 INTO v;
--
--
ELSE
SELECT
memory::bigint INTO v;
END CASE;
RETURN v;
END;
$$;
And to validate it works, run:
SELECT
memory_bytes ('3m') = 1 as subm,
memory_bytes ('3000m') = 3 as m,
memory_bytes ('31750') = 31750 AS empty,
memory_bytes ('32k') = 32000 AS k,
memory_bytes ('32Ki') = 32768 AS Ki,
memory_bytes ('10Mi') = 10485760 AS Mi,
memory_bytes ('1.5Gi') = 1610612736 AS Gi,
memory_bytes ('1.5Mi') = 1572864 as fraction_Mi;
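Now that everything is a plain bigint, aggregation is straightforward; for example, a sketch of the total requested memory per namespace, in raw bytes:
SELECT
  namespace,
  sum(memory_bytes (c -> 'resources' -> 'requests' ->> 'memory')) AS requested_bytes
FROM
  k.kubernetes_pod,
  jsonb_array_elements(containers) AS c
GROUP BY
  namespace
ORDER BY
  requested_bytes DESC NULLS LAST
LIMIT 5;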
Once the math is done, I want to be able to see the memory in something other than raw bytes, so I created an approximation function to display the data:
CREATE OR REPLACE FUNCTION memory_aprox_quantity (bytes numeric)
RETURNS text
LANGUAGE plpgsql
AS $$
DECLARE
v text;
BEGIN
-- https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/
-- base 10: m | "" | k | M | G | T | P | E
-- base 2: Ki | Mi | Gi | Ti | Pi | Ei
CASE --
WHEN bytes >= (1024::numeric * 1024 * 1024 * 1024 * 1024 * 1024) THEN
SELECT
concat(trim(TRAILING '.' FROM to_char(bytes / (1024::numeric * 1024 * 1024 * 1024 * 1024 * 1024), 'FM9999999999999999999.9')), 'Ei') INTO v;
WHEN bytes >= (1024::numeric * 1024 * 1024 * 1024 * 1024) THEN
SELECT
concat(trim(TRAILING '.' FROM to_char(bytes / (1024::numeric * 1024 * 1024 * 1024 * 1024), 'FM9999999999999999999.9')), 'Pi') INTO v;
WHEN bytes >= (1024::numeric * 1024 * 1024 * 1024) THEN
SELECT
concat(trim(TRAILING '.' FROM to_char(bytes / (1024::numeric * 1024 * 1024 * 1024), 'FM9999999999999999999.9')), 'Ti') INTO v;
WHEN bytes >= (1024 * 1024 * 1024) THEN
SELECT
concat(trim(TRAILING '.' FROM to_char(bytes / (1024 * 1024 * 1024), 'FM9999999999999999999.9')), 'Gi') INTO v;
WHEN bytes >= (1024 * 1024) THEN
SELECT
concat(trim(TRAILING '.' FROM to_char(bytes / (1024 * 1024), 'FM9999999999999999999.9')), 'Mi') INTO v;
WHEN bytes >= (1024) THEN
SELECT
concat(trim(TRAILING '.' FROM to_char(bytes / 1024, 'FM9999999999999999999.9')), 'Ki') INTO v;
--
--
ELSE
SELECT
bytes INTO v;
END CASE;
RETURN v;
END;
$$;
This one is not exact as I am only showing one decimal place.
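A couple of spot checks, mirroring the validation queries above:
SELECT
  memory_aprox_quantity (32768) = '32Ki' AS ki,
  memory_aprox_quantity (10485760) = '10Mi' AS mi,
  memory_aprox_quantity (1610612736) = '1.5Gi' AS gi,
  memory_aprox_quantity (500) = '500' AS bytes;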
The Queries In Action
Get all deployments, with their pods (through a ReplicaSet) and the nodes they are running on. Count how many of those pods are still running on m5.8xlarge nodes and how many m5.8xlarge nodes are running them.
To predict the outcome of rolling this deployment, also show the pod anti-affinity rules in place, to see whether the pods will be spread across hosts and zones or whether they can land on the same node.
SELECT
d.name AS deployment_name,
d.namespace,
d.replicas,
count(pod.name) AS pod_count,
cpu_m (pcrequests.* ->> 'cpu') AS pod_cpu,
memory_aprox_quantity (memory_bytes (pcrequests.* ->> 'memory')) AS pod_memory,
count(node.name) AS node_count,
jsonb_agg(pod.name) AS pod_name,
json_agg(DISTINCT (coalesce(requiredPodAntiAffinity, '[]') || coalesce(preferredPodAntiAffinity, '[]'))) AS podAntiAffinity,
jsonb_agg(node.name) AS node_name
FROM
k.kubernetes_deployment AS d
LEFT OUTER JOIN LATERAL jsonb_path_query(d.template, '$.spec.containers[*].resources.requests') pcrequests ON TRUE
JOIN k.kubernetes_replicaset AS rs ON rs.namespace = d.namespace
AND rs.owner_references -> 0 ->> 'uid' = d.uid
JOIN k.kubernetes_pod AS pod ON pod.namespace = rs.namespace
AND rs.uid = pod.owner_references -> 0 ->> 'uid'
LEFT OUTER JOIN k.kubernetes_node AS node ON node.name = pod.node_name
LEFT OUTER JOIN LATERAL jsonb_path_query(d.template, '$.spec.affinity.podAntiAffinity.requiredDuringSchedulingIgnoredDuringExecution[*].topologyKey') requiredPodAntiAffinity ON TRUE
LEFT OUTER JOIN LATERAL jsonb_path_query(d.template, '$.spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[*].podAffinityTerm.topologyKey') preferredPodAntiAffinity ON TRUE
WHERE
node.labels ->> 'beta.kubernetes.io/instance-type' = 'm5.8xlarge'
GROUP BY
d.name,
d.namespace,
d.replicas,
rs.name,
pod_cpu,
pod_memory
ORDER BY
d.replicas DESC,
node_count DESC
LIMIT 1;
+-----------------+-------------+----------+-----------+---------+------------+------------+----------+------------------------------+-----------+
| deployment_name | namespace   | replicas | pod_count | pod_cpu | pod_memory | node_count | pod_name | podantiaffinity              | node_name |
+-----------------+-------------+----------+-----------+---------+------------+------------+----------+------------------------------+-----------+
| coredns         | kube-system | 89       | 78        | 100     | 70Mi       | 78         | ...      | [["kubernetes.io/hostname"]] | ...       |
+-----------------+-------------+----------+-----------+---------+------------+------------+----------+------------------------------+-----------+
Same as above, but for StatefulSets.
SELECT
sts.name AS sts_name,
sts.namespace,
sts.replicas,
count(pod.name) AS pod_count,
cpu_m (pcrequests.* ->> 'cpu') AS pod_cpu,
memory_aprox_quantity (memory_bytes (pcrequests.* ->> 'memory')) AS pod_memory,
count(node.name) AS node_count,
jsonb_agg(pod.name) AS pod_name,
json_agg(DISTINCT (coalesce(requiredPodAntiAffinity, '[]') || coalesce(preferredPodAntiAffinity, '[]'))) AS podAntiAffinity,
jsonb_agg(node.name) AS node_name
FROM
k.kubernetes_stateful_set AS sts
LEFT OUTER JOIN LATERAL jsonb_path_query(sts.template, '$.spec.containers[*].resources.requests') pcrequests ON TRUE
JOIN k.kubernetes_pod AS pod ON pod.namespace = sts.namespace
AND sts.uid = pod.owner_references -> 0 ->> 'uid'
LEFT OUTER JOIN k.kubernetes_node AS node ON node.name = pod.node_name
LEFT OUTER JOIN LATERAL jsonb_path_query(sts.template, '$.spec.affinity.podAntiAffinity.requiredDuringSchedulingIgnoredDuringExecution[*].topologyKey') requiredPodAntiAffinity ON TRUE
LEFT OUTER JOIN LATERAL jsonb_path_query(sts.template, '$.spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[*].podAffinityTerm.topologyKey') preferredPodAntiAffinity ON TRUE
WHERE
node.labels ->> 'beta.kubernetes.io/instance-type' = 'm5.8xlarge'
GROUP BY
sts.name,
sts.namespace,
sts.replicas,
pod_cpu,
pod_memory
ORDER BY
sts.replicas DESC,
node_count DESC
LIMIT 1;
SELECT all nodes and list how many pods they run, along with the percentage of their allocatable CPU and memory those pods request, WHERE their instance type is m5.8xlarge. Optionally limit the pods to a single namespace.
SELECT
node.name,
node.labels ->> 'beta.kubernetes.io/instance-type' AS instancetype,
node.labels ->> 'kops.k8s.io/instancegroup' AS instancegroup,
count(pod.name) AS pod_count,
-- json_agg(pod.name) AS pods,
cpu_m (node.allocatable ->> 'cpu') AS node_cpu,
memory_aprox_quantity (memory_bytes (node.allocatable ->> 'memory')) AS node_memory,
sum(cpu_m (pcrequests.* ->> 'cpu')) AS pod_cpu,
to_char(sum(cpu_m (pcrequests.* ->> 'cpu')) * 100 / cpu_m (node.allocatable ->> 'cpu'), 'FM999.99') AS "pod_cpu_%",
memory_aprox_quantity (sum(memory_bytes (pcrequests.* ->> 'memory'))) AS pod_memory,
to_char(sum(memory_bytes (pcrequests.* ->> 'memory')) * 100 / memory_bytes (node.allocatable ->> 'memory'), 'FM999.99') AS "pod_memory_%"
FROM
k.kubernetes_node AS node
LEFT OUTER JOIN k.kubernetes_pod AS pod ON node.name = pod.node_name
CROSS JOIN LATERAL jsonb_path_query(pod.containers, '$[*].resources.requests') pcrequests
WHERE
--pod.namespace = 'default'
--AND
node.labels ->> 'beta.kubernetes.io/instance-type' = 'm5.8xlarge'
GROUP BY
node.name,
node.labels ->> 'beta.kubernetes.io/instance-type',
node.labels ->> 'kops.k8s.io/instancegroup',
node.allocatable ->> 'cpu',
node.allocatable ->> 'memory'
ORDER BY
instancetype,
pod_count DESC
LIMIT 1;
+---------------------------------------------+--------------+------------------+-----------+----------+-------------+---------+-----------+------------+--------------+
| name                                        | instancetype | instancegroup    | pod_count | node_cpu | node_memory | pod_cpu | pod_cpu_% | pod_memory | pod_memory_% |
+---------------------------------------------+--------------+------------------+-----------+----------+-------------+---------+-----------+------------+--------------+
| ip-xx-yy-39-148.us-west-2.compute.internal  | m5.8xlarge   | nodes-us-west-2c | 25        | 31750    | 119Gi       | 31060   | 97.83     | 75.3Gi     | 63.21        |
+---------------------------------------------+--------------+------------------+-----------+----------+-------------+---------+-----------+------------+--------------+
Conclusion
Once you get into the groove of Steampipe you start asking more and more questions in a pretty intuitive way. SQL has been my tool of choice for querying AWS and Kubernetes for a long time, and I have finally found a tool that does most of the work for me.
Getting comfortable with accessing the JSON objects is the hardest part, and it is something I really hope they improve upon in the future by exposing some well-known fields as top-level columns.