markdown

Self‑Hosted HA Kubernetes Cluster

Home‑grown, production‑grade Kubernetes platform built from the ground up

ha-cluster.jpg

Project Synopsis
Designed, provisioned, and operated a high‑availability (HA) Kubernetes cluster on two legacy machines (an i5 desktop + 32 GB RAM and an i7 laptop + 16 GB RAM). The whole stack is managed with Infrastructure‑as‑Code (IaC), GitOps, and modern cloud‑native tooling, turning a home lab into a fully functional private cloud that supports dozens of services (Traefik, NFS storage, Argo CD, Infisical, etc.).

Key Highlights

AreaWhat I DidTools & TechniquesImpact
Hardware & OSRepurposed old hardware, installed Debian 12, tuned SSH (custom port, key‑only auth, X11 forwarding)Debian, Zsh + Oh‑My‑Zsh, TmuxSecure, low‑overhead foundation for the cluster.
VirtualisationCreated 4 VMs (2 per host) with 2 CPU and 8 GiB RAM each, using bridged networking so each VM gets its own IP.KVM, libvirt, Vagrant (Ruby DSL)Fast provisioning, isolated environments for control‑plane and workers.
Cluster BootstrapReplaced manual kubeadm steps with Kubespray (Ansible) to install a 2‑node control plane + 3 etcd replicas.Kubespray, Ansible, Cert‑manager, Cilium CNI, MetalLB (layer‑2)Fully HA control plane, automatic node scaling, reliable service IP allocation.
Load Balancing & IngressDeployed MetalLB (layer‑2) as a bare‑metal load balancer, limited to the control‑plane nodes via node selectors.MetalLB v0.13.9, custom address pool 192.168.xx.10‑50Provides stable external IPs for all services without cloud LB.
StorageSet up an NFS share on the primary host and installed the kubernetes‑nfs‑provisioner as the default StorageClass.NFS, nfs‑subdir‑external‑provisionerPersistent volumes shared across all worker nodes.
Secrets ManagementIntegrated Infisical via its Kubernetes operator to inject secrets as env variables or files.Infisical operator, PostgreSQL backendCentralised, secure secret handling for all apps.
GitOps & CI/CDInstalled Argo CD (Helm) to sync manifests from GitHub, using Kustomize for versioning. Fixed sync issues caused by Infisical polling.Argo CD, Kustomize, GitHub ActionsZero‑touch deployments; rollout time dropped from hours to minutes.
Ingress RouterAdopted Traefik (Helm) with IngressRoute CRDs for URL‑based routing; roadmap to Gateway API.Traefik, HelmFlexible, per‑domain routing with automatic TLS (future).
Observability (planned)Future integration of OpenTelemetry, Prometheus, Grafana, Jaeger, and Checkov for security scanning.Will provide full telemetry and compliance.
Documentation & AutomationAuthored a detailed tutorial (the source of this description) and automated repeatable builds; CI pipeline reduces cluster rebuild from > 3 h to < 2 min.Markdown, Mermaid diagrams, CI scriptsKnowledge transfer and rapid recovery.

Why This Project Stands Out

  • HA Control Plane – 2 control‑plane nodes + 3 etcd members guarantee quorum and resilience.
  • Pure‑IaC Lifecycle – Vagrant + Kubespray + Ansible allow a declarative, version‑controlled infrastructure.
  • GitOps‑First – Argo CD continuously reconciles the desired state, cutting deployment time dramatically.
  • Production‑Ready Services – Traefik, MetalLB, NFS, and Infisical give the cluster all the plumbing needed for real workloads.
  • Self‑Hosted Philosophy – Shows mastery of end‑to‑end cloud stack without relying on any public provider, aligning with modern “edge‑cloud” trends.