Installing K3s and the First Pod

K3s installed in one command. Everything after that took longer. This post covers enabling cgroups on Raspberry Pi OS, fixing kubectl permissions properly via a systemd service override, and watching the pod scheduling pipeline run live in your own cluster — including what QoS classes actually mean when you change them in real time.

Previous post: From microSD to SSD Boot on Raspberry Pi 4

What this covers

With the Pi running cleanly from the Intenso SSD, the next goal is to get a real Kubernetes cluster running and understand what actually happens when a pod starts.

By the end of this post you'll have seen:

K3s installed and running on ARM hardware
The full pod scheduling pipeline live in your own cluster
Resource requests and limits in practice
QoS classes changing in real time based on what you define

Why K3s and not full Kubernetes

K3s is a lightweight Kubernetes distribution built specifically for edge devices, ARM hardware, and resource-constrained environments. It exposes the full Kubernetes API — every kubectl command you learn here works identically against a production EKS or GKE cluster — but it runs in a fraction of the memory.

Full Kubernetes on a Raspberry Pi 4 would consume most of the available RAM before you deployed a single workload. K3s runs comfortably alongside a full application stack on the same 4GB.

Source: https://docs.k3s.io/architecture

Step 1 — Enable cgroups

This is the step that catches almost everyone on Raspberry Pi OS. Kubernetes needs Linux cgroups (control groups) to enforce container resource limits — CPU throttling, memory limits, scheduling decisions. On Pi OS they're not fully enabled by default.

Edit /boot/firmware/cmdline.txt:

sudo nano /boot/firmware/cmdline.txt

Add these parameters at the end of the single line:

cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset

The full line now looks like:

console=serial0,115200 console=tty1 root=PARTUUID=6692b3d6-02 rootfstype=ext4 fsck.repair=yes rootwait usb-storage.quirks=152d:0579:u cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset

Verify the edit with cat -A /boot/firmware/cmdline.txt: the output should be a single line ending in one $. Then reboot.
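The same check can be scripted. This is a minimal sketch, not part of the K3s tooling: check_cmdline is a hypothetical helper name, and it only looks for the three parameters added in this step.

```shell
# check_cmdline FILE: succeed only if FILE is exactly one non-empty line
# and contains every cgroup parameter added in this step.
# (check_cmdline is a hypothetical helper name, not a K3s command.)
check_cmdline() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # The kernel command line must stay on one line.
    lines=$(grep -c . "$file" || true)
    if [ "$lines" -ne 1 ]; then
        echo "expected 1 line, found $lines" >&2
        return 1
    fi
    for param in cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset; do
        # Split on spaces and require an exact token match.
        if ! tr -s ' ' '\n' < "$file" | grep -Fxq -- "$param"; then
            echo "missing parameter: $param" >&2
            return 1
        fi
    done
    return 0
}
```

Run it as check_cmdline /boot/firmware/cmdline.txt before rebooting; a non-zero exit status means the file needs another look.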

After reboot, verify cgroups are active:

cat /sys/fs/cgroup/cgroup.controllers

Output:

cpuset cpu io memory pids

memory in that list confirms the memory controller is active. This Pi is on cgroup v2, which current Raspberry Pi OS uses by default. On cgroup v2 the check that matters is that memory appears in cgroup.controllers; the legacy cgroup_memory=1 parameter is harmless to leave in place.
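If you would rather not eyeball the list, the check can be wrapped in a small function. This is a sketch under the assumption that the node uses cgroup v2; has_controller is a hypothetical name.

```shell
# has_controller NAME [FILE]: succeed if NAME is listed in the cgroup v2
# controllers file. FILE defaults to the path used in this step.
# (has_controller is a hypothetical helper name.)
has_controller() {
    name=$1
    file=${2:-/sys/fs/cgroup/cgroup.controllers}
    # One controller per line, then an exact whole-line match,
    # so "cpu" does not accidentally match "cpuset".
    tr -s ' ' '\n' < "$file" | grep -Fxq -- "$name"
}
```

On the Pi, has_controller memory && echo "memory controller active" gives a yes/no answer that is easy to reuse in a setup script.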

Source: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

Source: https://docs.k3s.io/installation/requirements?os=pi#operating-systems

Step 2 — Install K3s

One command:

curl -sfL https://get.k3s.io | sh -

This downloads the K3s binary, installs it as a systemd service, generates a kubeconfig, and starts the cluster. Takes about 2 minutes on the Pi.

Verify the cluster started:

sudo systemctl status k3s | head -5

● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s.service; enabled)
     Active: active (running)
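Because the service takes a while to come up on the Pi, a polling helper saves re-running the status command by hand. This is a generic sketch, not something the K3s installer provides; wait_until is a hypothetical name.

```shell
# wait_until LIMIT_SECONDS COMMAND [ARGS...]
# Re-run COMMAND about once per second until it succeeds or the limit
# is reached. (wait_until is a hypothetical helper name.)
wait_until() {
    limit=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, wait_until 180 systemctl is-active --quiet k3s returns as soon as systemd reports the unit active, or fails after roughly three minutes.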

Source of truth: https://docs.k3s.io/quick-start

Step 3 — The kubectl permissions problem (and how it was fixed)

This is where it got interesting.

K3s writes its kubeconfig to /etc/rancher/k3s/k3s.yaml with root-only permissions (600). Running kubectl get nodes without sudo fails:

yvette@newerkey-lab:~ $ kubectl get nodes
WARN[0000] Unable to read /etc/rancher/k3s/k3s.yaml, please start server
with --write-kubeconfig-mode or --write-kubeconfig-group to modify kube
config permissions
error: error loading config file "/etc/rancher/k3s/k3s.yaml": open
/etc/rancher/k3s/k3s.yaml: permission denied

The documented fix is to create /etc/rancher/k3s/config.yaml with:

write-kubeconfig-mode: "0644"

This tells K3s to write the kubeconfig with readable permissions on every start. I verified the file was correct — cat -A showed clean YAML, xxd showed no hidden characters — but K3s kept ignoring it.

After debugging for what felt like forever, I took a more explicit approach: a systemd service override.

sudo mkdir -p /etc/systemd/system/k3s.service.d
sudo nano /etc/systemd/system/k3s.service.d/override.conf

Content:

[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server --write-kubeconfig-mode=0644

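The empty ExecStart= line clears the command inherited from the main unit before the next line sets the new one; systemd requires this when a drop-in overrides ExecStart. Because the whole command is replaced, any flags in the original ExecStart must be repeated here.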

sudo systemctl daemon-reload
sudo systemctl restart k3s
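To avoid typing the drop-in by hand, its content can be generated. This is a sketch that assumes the default install path /usr/local/bin/k3s and no other server flags; render_k3s_override is a hypothetical name.

```shell
# render_k3s_override [MODE]: print a drop-in that starts `k3s server`
# with the given kubeconfig file mode (default 0644).
# (render_k3s_override is a hypothetical helper name; the binary path
# and the absence of other server flags are assumptions.)
render_k3s_override() {
    mode=${1:-0644}
    cat <<EOF
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server --write-kubeconfig-mode=$mode
EOF
}
```

Piping it through sudo tee /etc/systemd/system/k3s.service.d/override.conf writes the file, and systemctl cat k3s afterwards shows the main unit together with the drop-in so you can confirm what will actually run.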

Verify the permissions changed:

ls -la /etc/rancher/k3s/k3s.yaml

-rw-r--r-- 1 root root 2941 Jun  5 16:45 /etc/rancher/k3s/k3s.yaml

-rw-r--r-- is mode 644: the owner can read and write the file, while everyone else (group and others) can only read. Now kubectl works without sudo, and the setting survives every restart and upgrade.

yvette@newerkey-lab:~ $ kubectl get nodes
NAME           STATUS   ROLES           AGE     VERSION
newerkey-lab   Ready    control-plane   3d18h   v1.35.5+k3s1
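The mapping between the octal mode and the rw- string in the ls output is mechanical: each digit covers owner, group, and others in turn. A small sketch (mode_to_rwx is a hypothetical name) that spells it out:

```shell
# mode_to_rwx OCTAL: print the nine-character permission string for a
# three-digit octal mode such as 644 (a leading 0, as in 0644, is dropped).
# (mode_to_rwx is a hypothetical helper name.)
mode_to_rwx() {
    case $1 in
        0???) mode=${1#0} ;;
        *)    mode=$1 ;;
    esac
    out=
    # Walk the digits: owner, group, others.
    for digit in $(printf '%s' "$mode" | sed 's/./& /g'); do
        case $digit in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
            *) echo "not an octal digit: $digit" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$out"
}
```

mode_to_rwx 600 shows the root-only default K3s started with, and mode_to_rwx 0644 shows what the override produces.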

Why I decided on the systemd override as my preferred long-term fix:

Survives K3s upgrades (stored separately from the K3s binary)
Survives restarts (applied every time the service starts)
Explicit — you can see exactly what flag is being passed
Standard Linux pattern

Source: https://www.freedesktop.org/software/systemd/man/latest/systemd.unit.html

Step 4 — What K3s installed automatically

yvette@newerkey-lab:~ $ kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS      RESTARTS       AGE
kube-system   coredns-8db54c48d-q6wzr                   1/1     Running     4 (147m ago)   3d21h
kube-system   helm-install-traefik-crd-2xxbj            0/1     Completed   0              3d21h
kube-system   helm-install-traefik-wf99m                0/1     Completed   2              3d21h
kube-system   local-path-provisioner-5d9d9885bc-bnddm   1/1     Running     4 (147m ago)   3d21h
kube-system   metrics-server-786d997795-twsvr           1/1     Running     4 (147m ago)   3d21h
kube-system   svclb-traefik-0ad420ed-zcg6n              2/2     Running     8 (147m ago)   3d21h
kube-system   traefik-9bcdbbd9-drlcw                    1/1     Running     4 (147m ago)   3d21h

What each component does:

coredns — Cluster DNS. Every service gets a DNS name automatically. Pods resolve http://my-service to the right cluster IP without any manual configuration.

helm-install-traefik (Completed) — Finished Jobs that ran once to install Traefik via Helm and exited. Completed status is expected — they're not consuming resources.

local-path-provisioner — Creates persistent volumes on the node's local disk. Needed from Day 3, when the private registry needs persistent storage.

metrics-server — Collects CPU and memory usage from all pods and nodes. Powers kubectl top and the Horizontal Pod Autoscaler.

svclb-traefik — K3s's built-in service load balancer. Routes external traffic into the cluster.

traefik — The ingress controller. Routes HTTP/HTTPS requests to the right service based on hostname and path rules. Day 4 builds on this directly.
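With this many system pods, a filter that only prints the ones needing attention is handy. This is a sketch that parses the default kubectl get pods --all-namespaces table from stdin; pods_not_ok is a hypothetical name, and it treats Running and Completed as the only healthy states.

```shell
# pods_not_ok: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name status" for every pod whose STATUS column is
# neither Running nor Completed.
# (pods_not_ok is a hypothetical helper name.)
pods_not_ok() {
    # NR > 1 skips the header row; STATUS is the fourth column.
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}
```

Usage: kubectl get pods --all-namespaces | pods_not_ok. Against the listing above it prints nothing, which is the result you want.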

Step 5 — The pod scheduling pipeline

Deploy the first pod:

kubectl run hello-nginx --image=nginx:alpine
kubectl get pods --watch

Once Running, describe it:

yvette@newerkey-lab:~ $ kubectl describe pod hello-nginx

The Events section at the bottom shows the full scheduling pipeline:

Normal  Scheduled  27m   default-scheduler  Successfully assigned default/hello-nginx to newerkey-lab
Normal  Pulling    27m   kubelet            Pulling image "nginx:alpine"
Normal  Pulled     27m   kubelet            Successfully pulled image "nginx:alpine" in 4.797s
Normal  Created    27m   kubelet            Container created
Normal  Started    27m   kubelet            Container started

Step by step:

kubectl run sends a pod spec to the API server
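The API server validates it and writes it to the cluster datastore (etcd in standard Kubernetes; a default single-server K3s install uses embedded SQLite instead)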
The scheduler watches for unscheduled pods and assigns this one to newerkey-lab
The kubelet on that node sees the assignment, pulls the image from the registry
The container runtime (containerd) creates and starts the container
Once the container is running, the pod phase moves to Running. No readiness probe is defined here, so the pod is marked Ready as soon as the container starts
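The steps above show up as the Reason column of the Events section, so the pipeline can be condensed to one line. This is a sketch that reads describe output from stdin; event_reasons is a hypothetical name.

```shell
# event_reasons: read `kubectl describe pod` output on stdin and print the
# Reason of each event, in order, joined by " -> ".
# (event_reasons is a hypothetical helper name.)
event_reasons() {
    # Event rows start with their type (Normal or Warning); Reason is column 2.
    awk '$1 == "Normal" || $1 == "Warning" { printf "%s%s", sep, $2; sep = " -> " }
         END { print "" }'
}
```

Usage: kubectl describe pod hello-nginx | event_reasons. For the events shown above it prints Scheduled -> Pulling -> Pulled -> Created -> Started.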

Also note: QoS Class: BestEffort at the bottom of the describe output. No resource requests or limits were set, so Kubernetes assigned the lowest quality of service class. These pods are first to be evicted under memory pressure.

Source: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/

Step 6 — The QoS class change

This was the most instructive part for me. I deleted the bare pod and redeployed it with resource requests and limits defined in a YAML manifest:

apiVersion: v1
kind: Pod
metadata:
  name: hello-nginx
  namespace: default
spec:
  containers:
  - name: hello-nginx
    image: nginx:alpine
    resources:
      requests:
        memory: "64Mi"
        cpu: "100m"
      limits:
        memory: "128Mi"
        cpu: "200m"

kubectl apply -f hello-nginx.yaml
kubectl describe pod hello-nginx

Two things changed in the describe output:

Requests and limits now appear:

Limits:
  cpu:     200m
  memory:  128Mi
Requests:
  cpu:        100m
  memory:     64Mi

QoS Class changed:

QoS Class: Burstable

Reading the documentation told me this would happen. Seeing it change in my own cluster output made it real.

The three QoS classes:

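Class	When assigned	Eviction order under node pressure
`Guaranteed`	every container sets CPU and memory requests and limits, with requests == limits	Last evicted
`Burstable`	at least one container sets a request or limit, but the pod does not qualify as Guaranteed	Middle
`BestEffort`	no requests or limits on any container	First evicted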

This matters for production platform work. A pod with no resource limits in a shared cluster is a noisy-neighbour problem: it can consume unbounded resources and starve other workloads. Setting limits is not optional on a platform that other teams depend on.
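The classification rules are simple enough to write down as a function. This is a simplified sketch for a single-container pod, not the kubelet's implementation: qos_class is a hypothetical name, and values are compared as plain strings, so equivalent quantities written differently (1 and 1000m, for example) would not be treated as equal.

```shell
# qos_class REQ_CPU REQ_MEM LIM_CPU LIM_MEM
# Print the QoS class for a single-container pod. Pass "" for any value
# the manifest does not set.
# (qos_class is a hypothetical helper; quantities are compared as strings.)
qos_class() {
    req_cpu=$1 req_mem=$2 lim_cpu=$3 lim_mem=$4
    # Nothing set at all: BestEffort.
    if [ -z "$req_cpu$req_mem$lim_cpu$lim_mem" ]; then
        echo BestEffort
        return 0
    fi
    # Kubernetes defaults an omitted request to the matching limit.
    [ -n "$req_cpu" ] || req_cpu=$lim_cpu
    [ -n "$req_mem" ] || req_mem=$lim_mem
    # Guaranteed needs both limits set and each request equal to its limit.
    if [ -n "$lim_cpu" ] && [ -n "$lim_mem" ] &&
       [ "$req_cpu" = "$lim_cpu" ] && [ "$req_mem" = "$lim_mem" ]; then
        echo Guaranteed
    else
        echo Burstable
    fi
}
```

qos_class 100m 64Mi 200m 128Mi matches the manifest in this step and prints Burstable; setting the requests to 200m and 128Mi would print Guaranteed. To read the class the cluster actually assigned, kubectl get pod hello-nginx -o jsonpath='{.status.qosClass}' returns it directly.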

Source: https://kubernetes.io/docs/concepts/workloads/pods/pod-qos/

Source: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

Step 7 — Checking actual resource usage

yvette@newerkey-lab:~ $ kubectl top pod hello-nginx
NAME          CPU(cores)   MEMORY(bytes)
hello-nginx   0m           4Mi

yvette@newerkey-lab:~ $ kubectl top node
NAME           CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)
newerkey-lab   129m         3%       1331Mi          35%

The pod is using 4Mi against a 128Mi limit — well within bounds. If it hit 128Mi, the Linux kernel OOM killer would terminate the container process and Kubernetes would restart it. Repeated OOMKills produce CrashLoopBackOff.

💡 Mi stands for mebibyte: 1Mi is 1,048,576 bytes (2^20).

The node is using 1331Mi of 4GB — 35%. That leaves roughly 2.4GB free for the rest of the work in this series.
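The headroom figure comes from simple arithmetic on the kubectl top output. This is a sketch using integer shell arithmetic; the function names are hypothetical, and because MEMORY(%) is itself rounded the results are estimates.

```shell
# node_total_mi USED_MI PERCENT: estimate the total the percentage is
# measured against, from the two memory columns of `kubectl top node`.
node_total_mi() { echo $(( $1 * 100 / $2 )); }

# headroom_mi USED_MI PERCENT: estimated memory not yet in use, in Mi.
headroom_mi() { echo $(( $1 * 100 / $2 - $1 )); }

# pct_of PART WHOLE: integer percentage, rounded down.
pct_of() { echo $(( $1 * 100 / $2 )); }

# (All three names are hypothetical helpers; results are rounded down.)
```

With the numbers above, node_total_mi 1331 35 gives 3802 and headroom_mi 1331 35 gives 2471, which is the roughly 2.4GB quoted. pct_of 4 128 gives 3, so the pod is using about 3% of its memory limit.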

Source: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_top/

The four kubectl commands to know

These are the common commands I used:

# List resources
kubectl get pods
kubectl get pods --all-namespaces
kubectl get pods --watch

# Detailed info — use this first when something is wrong
kubectl describe pod hello-nginx

# Application logs
kubectl logs hello-nginx

# Shell into a running container
kubectl exec -it hello-nginx -- /bin/sh

kubectl describe is the most important debugging command. The Events section shows exactly what happened at each stage — whether the pod failed to schedule, whether the image pull failed, whether the container crashed on start.
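When something looks wrong, I tend to run the same few commands in the same order, so they can be bundled. This is a sketch, not an established tool: triage_pod is a hypothetical name, and the KUBECTL variable is an assumption added so the sequence can be dry-run with echo.

```shell
# triage_pod POD: run the usual first-look commands in order.
# KUBECTL is a hypothetical override (defaults to kubectl) so the
# sequence can be previewed, e.g. KUBECTL=echo triage_pod hello-nginx.
triage_pod() {
    pod=$1
    kubectl_bin=${KUBECTL:-kubectl}
    # Where is it running and what state is it in?
    $kubectl_bin get pod "$pod" -o wide
    # Events: scheduling, image pull, container start.
    $kubectl_bin describe pod "$pod"
    # Last lines of application output.
    $kubectl_bin logs "$pod" --tail=20
}
```

Running KUBECTL=echo triage_pod hello-nginx prints the three command lines without touching the cluster; dropping the KUBECTL= prefix runs them for real.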

Source: https://kubernetes.io/docs/reference/kubectl/quick-reference/

What's running now

newerkey-lab (Raspberry Pi 4, 4GB)
└── K3s v1.35.5+k3s1
    ├── kube-system: coredns, traefik, metrics-server, local-path-provisioner
    └── default: hello-nginx (nginx:alpine, Burstable QoS, 4Mi/128Mi memory)

Node utilisation: 129m CPU (3%), 1331Mi memory (35%)
kubectl: working without sudo, permanent via systemd override

What I learned today

Kubernetes concepts land differently hands-on, but there is still a lot to take in, so don't rush to understand everything at once. QoS classes, the pod scheduling pipeline, resource requests: I'd read about all of these. Watching the QoS class change from BestEffort to Burstable in my own describe output, against a pod I had just deployed, made the concept stick in a way documentation alone doesn't.

Debugging is normal. The kubectl permissions issue took longer than the K3s install itself. The fix that worked, a systemd service override, is explicit and held up across restarts, whereas the documented config.yaml approach was ignored on my setup. Sometimes the detour teaches you more than the happy path would have.

Image caching is visible. First deployment: Successfully pulled image "nginx:alpine" in 4.797s. Second deployment: Container image "nginx:alpine" already present on machine. That's the image cache working, and the reason CronJob pods sometimes start slowly on nodes that haven't cached the image yet.

What's next

Private Docker Registry. Deploy a private container registry inside the K3s cluster with persistent storage, configure K3s to trust it, and push the first image from the laptop into the cluster.
