Installing K3s and the First Pod

K3s installed in one command. Everything after that took longer. This post covers enabling cgroups on Raspberry Pi OS, fixing kubectl permissions properly via a systemd service override, and watching the pod scheduling pipeline run live in your own cluster — including what QoS classes actually mean when you change them in real time.
Previous post: From microSD to SSD Boot on Raspberry Pi 4
What this covers
With the Pi running cleanly from the Intenso SSD, the next goal is to get a real Kubernetes cluster running and understand what's actually happening when a pod starts.
By the end of this post you'll have seen:
K3s installed and running on ARM hardware
The full pod scheduling pipeline live in your own cluster
Resource requests and limits in practice
QoS classes changing in real time based on what you define
Why K3s and not full Kubernetes
K3s is a lightweight Kubernetes distribution built specifically for edge devices, ARM hardware, and resource-constrained environments. It exposes the full Kubernetes API — every kubectl command you learn here works identically against a production EKS or GKE cluster — but it runs in a fraction of the memory.
Full Kubernetes on a Raspberry Pi 4 would consume most of the available RAM before you deployed a single workload. K3s runs comfortably alongside a full application stack on the same 4GB.
Source: https://docs.k3s.io/architecture
Step 1 — Enable cgroups
This is the step that catches almost everyone on Raspberry Pi OS. Kubernetes needs Linux cgroups (control groups) to enforce container resource limits — CPU throttling, memory limits, scheduling decisions. On Pi OS they're not fully enabled by default.
Edit /boot/firmware/cmdline.txt:
sudo nano /boot/firmware/cmdline.txt
Add these parameters at the end of the single line:
cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset
The full line now looks like:
console=serial0,115200 console=tty1 root=PARTUUID=6692b3d6-02 rootfstype=ext4 fsck.repair=yes rootwait usb-storage.quirks=152d:0579:u cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset
Verify with cat -A — one line, one $ at the end. Then reboot.
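If you'd rather not eyeball the cat -A output, the same check can be scripted. This is a minimal sketch: the function name check_cmdline is my own (hypothetical), and it assumes parameters are separated by single spaces and that the three cgroup parameters listed above are the ones you want.

```shell
# check_cmdline FILE
# Hypothetical helper: succeeds only if FILE is a single line that
# contains all three cgroup parameters from this step.
check_cmdline() {
  file="$1"
  rc=0

  # wc -l counts newline characters: 0 or 1 means a single line.
  newlines=$(($(wc -l < "$file")))
  if [ "$newlines" -gt 1 ]; then
    echo "error: $file has more than one line"
    rc=1
  fi

  # Split on spaces so each parameter is matched as a whole token.
  for param in cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset; do
    if ! tr ' ' '\n' < "$file" | grep -qxF -- "$param"; then
      echo "missing: $param"
      rc=1
    fi
  done

  return "$rc"
}
```

Usage on the Pi would be check_cmdline /boot/firmware/cmdline.txt; no output and a zero exit status means the file is safe to reboot with.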
After reboot, verify cgroups are active:
cat /sys/fs/cgroup/cgroup.controllers
Output:
cpuset cpu io memory pids
memory in that list confirms the memory controller is active. I am on cgroup v2, which modern Raspberry Pi OS uses by default. The legacy cgroup_memory=1 syntax is harmless, but the actual enablement comes from cgroup v2's memory controller being present.
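The controller check above can also be written as a small test for scripts. A sketch under assumptions: has_controller is a hypothetical name, and the default path only exists on a cgroup v2 system like mine.

```shell
# has_controller NAME [FILE]
# Hypothetical helper: succeeds if NAME is listed in the cgroup v2
# controllers file (default: /sys/fs/cgroup/cgroup.controllers).
has_controller() {
  name="$1"
  file="${2:-/sys/fs/cgroup/cgroup.controllers}"
  # One controller per line, then match the whole line exactly.
  tr ' ' '\n' < "$file" | grep -qxF -- "$name"
}
```

For example, has_controller memory && echo "memory controller active" prints the message only when the memory controller is enabled.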
Source: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
Source: https://docs.k3s.io/installation/requirements?os=pi#operating-systems
Step 2 — Install K3s
One command:
curl -sfL https://get.k3s.io | sh -
This downloads the K3s binary, installs it as a systemd service, generates a kubeconfig, and starts the cluster. Takes about 2 minutes on the Pi.
Verify the cluster started:
sudo systemctl status k3s | head -5
● k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled)
Active: active (running)
Source: https://docs.k3s.io/quick-start
Step 3 — The kubectl permissions problem (and how it was fixed)
This is where it got interesting.
K3s writes its kubeconfig to /etc/rancher/k3s/k3s.yaml with root-only permissions (600). Running kubectl get nodes without sudo fails:
yvette@newerkey-lab:~ $ kubectl get nodes
WARN[0000] Unable to read /etc/rancher/k3s/k3s.yaml, please start server
with --write-kubeconfig-mode or --write-kubeconfig-group to modify kube
config permissions
error: error loading config file "/etc/rancher/k3s/k3s.yaml": open
/etc/rancher/k3s/k3s.yaml: permission denied
The documented fix is to create /etc/rancher/k3s/config.yaml with:
write-kubeconfig-mode: "0644"
This tells K3s to write the kubeconfig with readable permissions on every start. I verified the file was correct — cat -A showed clean YAML, xxd showed no hidden characters — but K3s kept ignoring it.
After debugging for what felt like forever, I took a more explicit approach: a systemd service override.
sudo mkdir -p /etc/systemd/system/k3s.service.d
sudo nano /etc/systemd/system/k3s.service.d/override.conf
Content:
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server --write-kubeconfig-mode=0644
The first blank ExecStart= clears the existing command before setting the new one — required systemd syntax for overrides.
sudo systemctl daemon-reload
sudo systemctl restart k3s
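For a repeatable setup, the drop-in can be written without opening an editor. This is a sketch, not the only way to do it: write_k3s_override is a hypothetical helper name, and the target directory is passed in so the function can be tried against a scratch directory first.

```shell
# write_k3s_override DIR
# Hypothetical helper: writes the drop-in shown above to DIR/override.conf.
# On the Pi, DIR would be /etc/systemd/system/k3s.service.d (needs root).
write_k3s_override() {
  dir="$1"
  mkdir -p "$dir"
  # The empty ExecStart= line must come before the replacement command.
  cat > "$dir/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server --write-kubeconfig-mode=0644
EOF
}
```

Run it from a root shell against the real directory, then reload and restart as above. systemctl cat k3s prints the unit together with any drop-ins, which is a quick way to confirm the override was picked up.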
Verify the permissions changed:
ls -la /etc/rancher/k3s/k3s.yaml
-rw-r--r-- 1 root root 2941 Jun 5 16:45 /etc/rancher/k3s/k3s.yaml
-rw-r--r-- is 644: the owner can read and write the file, while everyone else (group and others) can only read. Now kubectl works without sudo, and the setting survives every restart and upgrade.
yvette@newerkey-lab:~ $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
newerkey-lab Ready control-plane 3d18h v1.35.5+k3s1
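How -rw-r--r-- maps to 644 in the ls -la output above is simple arithmetic: each rwx triplet adds 4 for read, 2 for write, and 1 for execute. A small sketch of that conversion follows; perm_to_octal is a hypothetical name, and it deliberately ignores the setuid, setgid, and sticky bits.

```shell
# perm_to_octal PERMSTRING
# Hypothetical helper: converts an ls-style string such as -rw-r--r--
# into its octal mode. Special bits (s, t) are not handled.
perm_to_octal() {
  # Drop the leading file-type character, keep the nine permission chars.
  perms=$(printf '%s' "$1" | cut -c2-10)
  out=""
  for start in 1 4 7; do
    triplet=$(printf '%s' "$perms" | cut -c"$start"-"$((start + 2))")
    digit=0
    case "$triplet" in r??) digit=$((digit + 4)) ;; esac  # read
    case "$triplet" in ?w?) digit=$((digit + 2)) ;; esac  # write
    case "$triplet" in ??x) digit=$((digit + 1)) ;; esac  # execute
    out="$out$digit"
  done
  printf '%s\n' "$out"
}
```

perm_to_octal "-rw-r--r--" prints 644, and the root-only default "-rw-------" comes out as 600.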
Why I decided on the systemd override as my preferred long-term fix:
Survives K3s upgrades (stored separately from the K3s binary)
Survives restarts (applied every time the service starts)
Explicit — you can see exactly what flag is being passed
Standard Linux pattern
Source: https://www.freedesktop.org/software/systemd/man/latest/systemd.unit.html
Step 4 — What K3s installed automatically
yvette@newerkey-lab:~ $ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-8db54c48d-q6wzr 1/1 Running 4 (147m ago) 3d21h
kube-system helm-install-traefik-crd-2xxbj 0/1 Completed 0 3d21h
kube-system helm-install-traefik-wf99m 0/1 Completed 2 3d21h
kube-system local-path-provisioner-5d9d9885bc-bnddm 1/1 Running 4 (147m ago) 3d21h
kube-system metrics-server-786d997795-twsvr 1/1 Running 4 (147m ago) 3d21h
kube-system svclb-traefik-0ad420ed-zcg6n 2/2 Running 8 (147m ago) 3d21h
kube-system traefik-9bcdbbd9-drlcw 1/1 Running 4 (147m ago) 3d21h
What each component does:
coredns — Cluster DNS. Every service gets a DNS name automatically. Pods resolve http://my-service to the right cluster IP without any manual configuration.
helm-install-traefik (Completed) — Finished Jobs that ran once to install Traefik via Helm and exited. Completed status is expected — they're not consuming resources.
local-path-provisioner — Creates Persistent volumes on the node's local disk. Needed from Day 3 when the private registry needs persistent storage.
metrics-server — Collects CPU and memory usage from all pods and nodes. Powers kubectl top and the Horizontal Pod Autoscaler.
svclb-traefik — K3s's built-in service load balancer. Routes external traffic into the cluster.
traefik — The ingress controller. Routes HTTP/HTTPS requests to the right service based on hostname and path rules. Day 4 builds on this directly.
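With this many system pods, a quick filter helps spot the ones that are not in an expected state. A minimal sketch, assuming the default column layout shown above (STATUS is the fourth column); unhealthy_pods is a hypothetical name, and it does not catch pods that are Running but not ready.

```shell
# unhealthy_pods
# Hypothetical helper: prints namespace/name and status for every pod whose
# STATUS column is neither Running nor Completed. Prints nothing when all
# pods are in one of those two states.
unhealthy_pods() {
  kubectl get pods --all-namespaces --no-headers |
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Against the listing above it prints nothing, which is the result you want.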
Step 5 — The pod scheduling pipeline
Deploy the first pod:
kubectl run hello-nginx --image=nginx:alpine
kubectl get pods --watch
Once Running, describe it:
yvette@newerkey-lab:~ $ kubectl describe pod hello-nginx
The Events section at the bottom shows the full scheduling pipeline:
Normal Scheduled 27m default-scheduler Successfully assigned default/hello-nginx to newerkey-lab
Normal Pulling 27m kubelet Pulling image "nginx:alpine"
Normal Pulled 27m kubelet Successfully pulled image "nginx:alpine" in 4.797s
Normal Created 27m kubelet Container created
Normal Started 27m kubelet Container started
Step by step:
1. kubectl run sends a pod spec to the API server
2. The API server writes it to the cluster datastore (etcd in upstream Kubernetes; a single-server K3s install defaults to embedded SQLite)
3. The scheduler watches for unscheduled pods and assigns this one to newerkey-lab
4. The kubelet on that node sees the assignment and pulls the image from the registry
5. The container runtime (containerd) creates and starts the container
6. Once the container has started, the pod moves to Running; readiness probes, when defined, control the separate Ready condition
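To see just the sequence of stages, the Reason column can be pulled out of the Events section. This is a sketch that assumes the describe layout shown above (an Events: header followed by Normal or Warning rows); pod_event_reasons is a hypothetical name.

```shell
# pod_event_reasons POD
# Hypothetical helper: prints the Reason of each event for POD,
# one per line, in the order kubectl describe lists them.
pod_event_reasons() {
  kubectl describe pod "$1" | awk '
    /^Events:/ { in_events = 1; next }
    in_events && ($1 == "Normal" || $1 == "Warning") { print $2 }'
}
```

For the pod above, pod_event_reasons hello-nginx should list Scheduled, Pulling, Pulled, Created, and Started, one per line. A Warning row such as a failed image pull would show up in the same list at the point where the pipeline stalled.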
Also note: QoS Class: BestEffort at the bottom of the describe output. No resource requests or limits were set, so Kubernetes assigned the lowest quality of service class. These pods are first to be evicted under memory pressure.
Source: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/
Step 6 — The QoS class change
This was the most insightful part for me. I deleted the bare pod and redeployed it with resource requests and limits defined in a YAML manifest:
apiVersion: v1
kind: Pod
metadata:
name: hello-nginx
namespace: default
spec:
containers:
- name: hello-nginx
image: nginx:alpine
resources:
requests:
memory: "64Mi"
cpu: "100m"
limits:
memory: "128Mi"
cpu: "200m"
kubectl apply -f hello-nginx.yaml
kubectl describe pod hello-nginx
Two things changed in the describe output:
Requests and limits now appear:
Limits:
cpu: 200m
memory: 128Mi
Requests:
cpu: 100m
memory: 64Mi
QoS Class changed:
QoS Class: Burstable
Reading the documentation told me this would happen. Seeing it change in my own cluster output made it real.
The three QoS classes:
| Class | When assigned | Eviction priority |
|---|---|---|
| Guaranteed | every container sets CPU and memory requests equal to its limits | Last evicted |
| Burstable | at least one container sets a request or limit, but the pod does not meet the Guaranteed criteria | Middle |
| BestEffort | no requests or limits on any container | First evicted |
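The classification rules can be sketched as a small function for a single-container pod. Treat it as an illustration only: qos_class is a hypothetical name, values are compared as plain strings (so 0.1 and 100m would not count as equal), and the real decision is made by Kubernetes across all containers in the pod.

```shell
# qos_class REQ_CPU REQ_MEM LIM_CPU LIM_MEM
# Hypothetical helper for a single-container pod. Pass "" for unset values.
qos_class() {
  req_cpu="$1"; req_mem="$2"; lim_cpu="$3"; lim_mem="$4"

  # Nothing set at all: lowest class.
  if [ -z "$req_cpu$req_mem$lim_cpu$lim_mem" ]; then
    echo BestEffort
    return
  fi

  # A missing request defaults to the limit for that resource.
  : "${req_cpu:=$lim_cpu}"
  : "${req_mem:=$lim_mem}"

  # Guaranteed needs both limits set and each request equal to its limit.
  if [ -n "$lim_cpu" ] && [ -n "$lim_mem" ] &&
     [ "$req_cpu" = "$lim_cpu" ] && [ "$req_mem" = "$lim_mem" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}
```

qos_class 100m 64Mi 200m 128Mi prints Burstable, matching the manifest above, while raising the requests to 200m and 128Mi would print Guaranteed. To read the class Kubernetes actually assigned, kubectl get pod hello-nginx -o jsonpath='{.status.qosClass}' returns it directly.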
This matters for production platform work. A pod with no resource limits in a shared cluster is a noisy-neighbour problem: it can consume unbounded resources and starve other workloads. Setting limits is not optional on a platform that other teams depend on.
Source: https://kubernetes.io/docs/concepts/workloads/pods/pod-qos/
Source: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
Step 7 — Checking actual resource usage
yvette@newerkey-lab:~ $ kubectl top pod hello-nginx
NAME CPU(cores) MEMORY(bytes)
hello-nginx 0m 4Mi
yvette@newerkey-lab:~ $ kubectl top node
NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%)
newerkey-lab 129m 3% 1331Mi 35%
The pod is using 4Mi against a 128Mi limit — well within bounds. If it hit 128Mi, the Linux kernel OOM killer would terminate the container process and Kubernetes would restart it. Repeated OOMKills produce CrashLoopBackOff.
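If a pod does start restarting, the reason for the previous termination is recorded in its status. A minimal sketch for checking it, assuming a single-container pod like this one (index 0); the two function names are hypothetical.

```shell
# last_termination_reason POD
# Prints why the first container of POD last terminated
# (empty if it has never terminated).
last_termination_reason() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}

# was_oom_killed POD
# Succeeds if the first container's previous run ended with OOMKilled.
was_oom_killed() {
  [ "$(last_termination_reason "$1")" = "OOMKilled" ]
}
```

For example, was_oom_killed hello-nginx && echo "raise the memory limit" only prints when the last restart was caused by the memory limit.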
The node is using 1331Mi of 4GB — 35%. That leaves roughly 2.4GB free for the rest of the work in this series.
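The headroom figure can be estimated from the kubectl top node output itself: dividing used memory by the percentage gives the total, and subtracting gives what is free. A sketch under assumptions: node_free_memory_mi is a hypothetical name, memory is assumed to be reported in Mi, and because the percentage is rounded the result is only approximate.

```shell
# node_free_memory_mi
# Hypothetical helper: for each node, estimates total and free memory in Mi
# from the MEMORY(bytes) and MEMORY(%) columns of kubectl top node.
node_free_memory_mi() {
  kubectl top node --no-headers |
    awk '{
      used = $4; sub(/Mi$/, "", used); used += 0   # "1331Mi" -> 1331
      pct  = $5; sub(/%$/, "", pct);   pct  += 0   # "35%"    -> 35
      if (pct > 0) {
        total = int(used * 100 / pct)              # estimated total
        free  = total - used
        print $1, free "Mi free of ~" total "Mi"
      }
    }'
}
```

With the numbers above (1331Mi at 35%), the estimate works out to about 2471Mi free of roughly 3802Mi, which is where the 2.4GB figure comes from.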
Source: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_top/
The four kubectl commands to know
These are the common commands I used:
# List resources
kubectl get pods
kubectl get pods --all-namespaces
kubectl get pods --watch
# Detailed info — use this first when something is wrong
kubectl describe pod hello-nginx
# Application logs
kubectl logs hello-nginx
# Shell into a running container
kubectl exec -it hello-nginx -- /bin/sh
kubectl describe is the most important debugging command. The Events section shows exactly what happened at each stage — whether the pod failed to schedule, whether the image pull failed, whether the container crashed on start.
Source: https://kubernetes.io/docs/reference/kubectl/quick-reference/
What's running now
newerkey-lab (Raspberry Pi 4, 4GB)
└── K3s v1.35.5+k3s1
├── kube-system: coredns, traefik, metrics-server, local-path-provisioner
└── default: hello-nginx (nginx:alpine, Burstable QoS, 4Mi/128Mi memory)
Node utilisation: 129m CPU (3%), 1331Mi memory (35%)
kubectl: working without sudo, permanent via systemd override
What I learned today
Kubernetes concepts land differently hands-on, but there is still a lot to take in, so don't rush to understand everything at once. QoS classes, the pod scheduling pipeline, resource requests: I'd read about all of these. Watching the QoS class change from BestEffort to Burstable in my own describe output, against a pod I just deployed, made the concept stick in a way documentation alone doesn't.
Debugging is normal. The kubectl permissions issue took longer than the K3s install itself. The solution, a systemd service override, proved more reliable on my setup than the documented config.yaml approach. Sometimes the detour teaches you more than the happy path would have.
Image caching is visible. First deployment: Pulling image "nginx:alpine" in 4.797s. Second deployment: Container image "nginx:alpine" already present on machine. That's the image cache working — and the reason CronJob pods sometimes start slowly on nodes that haven't cached the image yet.
What's next
Private Docker Registry. Deploy a private container registry inside the K3s cluster with persistent storage, configure K3s to trust it, and push the first image from the laptop into the cluster.


