
Designing a Cluster

Important questions when creating a Kubernetes cluster:

  1. What is the purpose of this cluster?

    • Learning? Try minikube, or a single-node cluster with kubeadm on GCP or AWS
    • Dev/Test? A multi-node cluster with a single controller and multiple workers. Set it up with the kubeadm tool, or quickly provision it on GCP, AWS, or Azure (commands sketched after this list).
    • Production application? See Hosting Production-Grade Clusters below.
  2. Cloud or on-prem? Which cloud or what hardware? GKE for GCP, kops for AWS, AKS for Azure

  3. Workload analysis
    • How many workloads?
    • What kind? Web? Big Data/Analytics? GPU needs?
    • Application resource requirements: CPU, memory
    • Traffic patterns: heavy traffic? Bursting traffic?
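A minimal sketch of the first two options, assuming a local machine for minikube and pre-provisioned Linux hosts for kubeadm; the pod network CIDR, controller address, token, and hash below are placeholders:

```bash
# Learning: single-node cluster; minikube provisions the VM/container for you
minikube start

# Dev/Test: multi-node cluster on machines you have already provisioned.
# Run on the controller node; the pod CIDR must match the CNI plugin you install afterwards.
kubeadm init --pod-network-cidr=10.244.0.0/16

# Run on each worker node, using the join command printed by 'kubeadm init'
kubeadm join <controller-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
```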


Kubernetes does not run natively on Windows; on a Windows machine you need virtualization (e.g. Hyper-V or VirtualBox) to run Linux nodes.

Hosting Production-Grade Clusters

  1. High-availability, multi-node cluster with multiple controller nodes
  2. Provision with kubeadm, or use GKE on GCP, kops on AWS, or another supported platform
  3. Up to 5,000 nodes
  4. Up to 150,000 pods in the cluster
  5. Up to 300,000 Total Containers
  6. Up to 100 Pods per Node

Resource considerations:

Nodes      vCPU   Memory (GB)
1-5        1      3.75
6-10       2      7.5
11-100     4      15
101-250    8      30
251-500    16     60
> 500      32     120

Storage considerations

  • High performance SSDs
  • Multiple concurrent connections - network based storage
  • Persistent shared volumes
  • Label nodes with specific disk types
  • Use node selectors to assign applications to nodes with specific disk types (see the sketch after this list)
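A minimal sketch of the label/node-selector pattern; the node name, label key/value, pod name, and image are illustrative:

```bash
# Label the node that has local SSDs
kubectl label nodes node01 disktype=ssd

# Pin an I/O-heavy pod onto nodes carrying that label
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: fast-storage-app
spec:
  nodeSelector:
    disktype: ssd
  containers:
  - name: app
    image: nginx
EOF
```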

In large clusters you can separate etcd into its own nodes.

Deployment Tools

Tool       Can Create VMs   Multi-Node
minikube   Yes              No (single node)
kubeadm    No               Yes
kops       Yes              Yes
Vagrant    Yes              Yes

Turnkey Solutions

  • OpenShift
  • Cloud Foundry Container Runtime
  • VMware Cloud PKS

Hosted Solutions

  • Google Kubernetes Engine (GKE)
  • OpenShift Online
  • Azure Kubernetes Service (AKS)
  • Amazon Elastic Kubernetes Service (EKS)

High Availability

If the controller node goes down, the workers keep running their existing workloads, but there is no scheduling, scaling, or self-healing until the control plane comes back, so things only stay fine until something goes wrong. For production, run multiple controller nodes.

Component            HA Style
API Server           Active/Active (behind a load balancer)
Controller Manager   Leader/Follower (--leader-elect)
Scheduler            Leader/Follower (--leader-elect)
ETCD                 Stacked / External
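A minimal kubeadm sketch of that layout, assuming a load balancer already fronts the API servers; the DNS name, token, hash, and certificate key are placeholders:

```bash
# First controller node: clients and other nodes talk to the load balancer, not this host
kubeadm init --control-plane-endpoint "lb.example.com:6443" --upload-certs

# Each additional controller node joins as a control-plane member
kubeadm join lb.example.com:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>

# Workers join as usual
kubeadm join lb.example.com:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
```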

kube-controller-manager and kube-scheduler: --leader-elect true

  • Whichever process first acquires the lock on the kube-controller-manager Endpoint object becomes the active instance. The lock is held for the lease duration set by --leader-elect-lease-duration (default 15s), and the active leader renews it every --leader-elect-renew-deadline (default 10s).

Both processes attempt to acquire the lock every 2 seconds (--leader-elect-retry-period).
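On a kubeadm cluster you can check these settings in the static pod manifests; the path below is the kubeadm default and the values in the comments are the upstream defaults:

```bash
# Leader-election flags for the controller manager; kube-scheduler takes the same flags
grep leader-elect /etc/kubernetes/manifests/kube-controller-manager.yaml

#   --leader-elect=true                 enable leader election
#   --leader-elect-lease-duration=15s   how long the lock is held before others may claim it
#   --leader-elect-renew-deadline=10s   window within which the active leader must renew the lock
#   --leader-elect-retry-period=2s      how often candidates retry acquiring leadership
```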

ETCD

Stacked topology: ETCD runs on the controller nodes themselves; simpler and cheaper, but losing a node takes out both a control plane instance and an etcd member.
External ETCD: ETCD runs on its own dedicated nodes; less risky, but harder to set up and roughly twice as many servers.
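A sketch of pointing kubeadm at an external etcd cluster; the apiVersion depends on your kubeadm version, and the endpoints and certificate paths are examples (real client certificates must already exist at those paths):

```bash
cat <<EOF > kubeadm-external-etcd.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "lb.example.com:6443"
etcd:
  external:
    endpoints:
      - https://10.0.0.11:2379
      - https://10.0.0.12:2379
      - https://10.0.0.13:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
EOF

kubeadm init --config kubeadm-external-etcd.yaml --upload-certs
```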