Getting started with Kops

I’ve been trying to learn more about the Kubernetes ecosystem, and that means getting familiar with different tools. I have worked with EKS (Amazon Elastic Kubernetes Service) in the past, but this time I decided to install a k8s cluster using kops. This tool lets you create and destroy production-ready Kubernetes clusters and handles operational tasks such as upgrading nodes.

This post aims to gather all the steps, but the official documentation should always be the source of truth.

Prerequisites

First of all, download a stable release of kops and put it in your PATH. One of the first steps is to set up an IAM user (kops) with enough permissions to manage our k8s infrastructure. This requires:

  • EC2 Full Access
  • Route53 Full Access
  • S3 Full Access
  • IAM Full Access
  • VPC Full Access

The idea is to create an IAM group, attach those policies, and add the user kops to that group.

aws iam create-group --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/IAMFullAccess --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonVPCFullAccess --group-name kops
aws iam create-user --user-name kops
aws iam add-user-to-group --user-name kops --group-name kops
aws iam create-access-key --user-name kops
aws s3api create-bucket \
    --bucket kops-wipefs-store \
    --region us-east-1
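
The last command creates the S3 bucket that will hold the kops state store. The kops docs also recommend enabling versioning on this bucket, so you can recover previous versions of your cluster state:

aws s3api put-bucket-versioning \
    --bucket kops-wipefs-store \
    --versioning-configuration Status=Enabled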

The create-access-key command outputs your aws_access_key_id and aws_secret_access_key. Instead of exporting those values as environment variables, I’d suggest defining a profile in your ~/.aws/credentials:

[kops]
aws_access_key_id = {{ aws_access_key_id }}
aws_secret_access_key = {{ aws_secret_access_key }}

Afterwards, just export AWS_PROFILE=kops and verify it is working.

$ aws sts get-caller-identity
{
    "UserId": "$SOME_USER_ID",
    "Account": "$ACCOUNT_ID",
    "Arn": "arn:aws:iam::$ACCOUNT_ID:user/kops"
}

Setting up DNS

The official docs give you several options:

  • Using a domain hosted via Route53
  • Using a subdomain under a domain hosted via Route53
  • Transfer your domain to Route53
  • Subdomain delegation using AWS DNS

I don’t have any domain under AWS Route53, and for these tests I wasn’t going to transfer any domain. The only option left was subdomain delegation.

This means that you create a subdomain under your domain/registrar, but its nameservers are managed by AWS. I have a domain with Cloudflare and it worked instantly. Create a Route53 zone for the subdomain you want; in the example below, kops.wipefs.com.

ID=$(uuidgen) && aws route53 create-hosted-zone --name kops.wipefs.com --caller-reference $ID | jq .DelegationSet.NameServers
[
  "ns-652.awsdns-17.net",
  "ns-1233.awsdns-26.org",
  "ns-400.awsdns-50.com",
  "ns-1730.awsdns-24.co.uk"
]

This outputs the nameservers that you need to add as NS records for the subdomain in Cloudflare.
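
For illustration, in Cloudflare that translates to one NS record per nameserver on the kops subdomain, something like:

Type  Name  Content
NS    kops  ns-652.awsdns-17.net
NS    kops  ns-1233.awsdns-26.org
NS    kops  ns-400.awsdns-50.com
NS    kops  ns-1730.awsdns-24.co.uk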

If everything went well:

$ dig ns kops.wipefs.com +short
ns-1730.awsdns-24.co.uk.
ns-400.awsdns-50.com.
ns-652.awsdns-17.net.
ns-1233.awsdns-26.org.

Bootstrapping a Kubernetes cluster

The setup requires exporting a few environment variables. If you defined an AWS profile, go ahead and export it as well.

export AWS_PROFILE=kops
export NAME=cluster01.kops.wipefs.com
export KOPS_STATE_STORE=s3://kops-wipefs-store

Keep in mind that each time you interact with kops, KOPS_STATE_STORE must be available in your environment. Optionally, you can pass the --state flag to kops. If you want to customize your control plane or data plane, this is the right time. I decided to test a new cluster with 3 master nodes and 3 worker nodes of different instance sizes, and it was pretty smooth sailing.

$ kops create cluster --name $NAME --state s3://kops-wipefs-store --master-count 3 --master-size m5.large --node-count 3 --node-size m5.xlarge --kubernetes-version 1.18.3 --zones eu-central-1b

I0627 14:42:35.214864  260016 create_cluster.go:562] Inferred --cloud=aws from zone "eu-central-1b"
W0627 14:42:35.214958  260016 create_cluster.go:778] Running with masters in the same AZs; redundancy will be reduced
I0627 14:42:35.435383  260016 subnets.go:184] Assigned CIDR 172.20.32.0/19 to subnet eu-central-1b
Previewing changes that will be made:


SSH public key must be specified when running with AWS (create with `kops create secret --name cluster01.kops.wipefs.com sshpublickey admin -i ~/.ssh/id_rsa.pub`)

An SSH public key needs to be added for this cluster:

$ kops create secret --name cluster01.kops.wipefs.com sshpublickey admin -i ~/.ssh/id_kops.pub  --state s3://kops-wipefs-store
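
If you don’t have a dedicated key pair yet, generate one before running the command above (the id_kops filename is just what I used; a standard RSA key is fine):

$ ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_kops -N ""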

An easy way to see a dry run is to run kops update cluster --name $NAME. In order to actually bootstrap the cluster, add the --yes flag to the previous command.

$ kops update cluster --name $NAME --yes

After around 10-15 minutes, you can verify that the cluster is up and running.

$ kops validate cluster
                          
Using cluster from kubectl context: cluster01.kops.wipefs.com

Validating cluster cluster01.kops.wipefs.com

INSTANCE GROUPS
NAME			ROLE	MACHINETYPE	MIN	MAX	SUBNETS
master-eu-central-1b-1	Master	m5.large	1	1	eu-central-1b
master-eu-central-1b-2	Master	m5.large	1	1	eu-central-1b
master-eu-central-1b-3	Master	m5.large	1	1	eu-central-1b
nodes			Node	m5.xlarge	3	3	eu-central-1b

NODE STATUS
NAME						ROLE	READY
ip-172-20-32-150.eu-central-1.compute.internal	master	True
ip-172-20-34-242.eu-central-1.compute.internal	node	True
ip-172-20-47-125.eu-central-1.compute.internal	node	True
ip-172-20-48-156.eu-central-1.compute.internal	node	True
ip-172-20-58-243.eu-central-1.compute.internal	master	True
ip-172-20-59-206.eu-central-1.compute.internal	master	True

Your cluster cluster01.kops.wipefs.com is ready

Because we are using subdomain delegation, our kube-apiserver endpoint is publicly exposed, so consider the security implications of this approach.
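
One way to reduce the exposure is to restrict which CIDR blocks can reach the API server and SSH through the cluster spec. A minimal sketch via kops edit cluster --name $NAME (203.0.113.0/24 is a placeholder; use your own range):

spec:
  kubernetesApiAccess:
  - 203.0.113.0/24
  sshAccess:
  - 203.0.113.0/24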

Metrics server

In the official repository you can find the documentation on how to enable the metrics server. It requires enabling service account tokens for the kubelet; a sketch of the relevant settings follows. Afterwards, proceed to update the cluster.
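
These are roughly the kubelet settings the addon docs describe, applied via kops edit cluster --name $NAME (double-check the exact fields against the docs for your kops version):

spec:
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook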

$ kops update cluster --yes 
Using cluster from kubectl context: cluster01.kops.wipefs.com

I0627 15:12:19.277916  305408 executor.go:103] Tasks: 0 done / 96 total; 50 can run
I0627 15:12:20.624121  305408 executor.go:103] Tasks: 50 done / 96 total; 24 can run
I0627 15:12:21.129271  305408 executor.go:103] Tasks: 74 done / 96 total; 18 can run
I0627 15:12:22.973913  305408 executor.go:103] Tasks: 92 done / 96 total; 4 can run
I0627 15:12:23.313143  305408 executor.go:103] Tasks: 96 done / 96 total; 0 can run
I0627 15:12:23.313169  305408 dns.go:155] Pre-creating DNS records
I0627 15:12:24.007260  305408 update_cluster.go:305] Exporting kubecfg for cluster
kops has set your kubectl context to cluster01.kops.wipefs.com

Cluster changes have been applied to the cloud.


Changes may require instances to restart: kops rolling-update cluster

Everything is ready to run a rolling update.

$ kops rolling-update cluster --yes

Using cluster from kubectl context: cluster01.kops.wipefs.com

NAME			STATUS		NEEDUPDATE	READY	MIN	MAX	NODES
master-eu-central-1b-1	NeedsUpdate	1		0	1	1	1
master-eu-central-1b-2	NeedsUpdate	1		0	1	1	1
master-eu-central-1b-3	NeedsUpdate	1		0	1	1	1
nodes			NeedsUpdate	3		0	3	3	3

I noticed the rolling update sometimes gets stuck; simply re-run it and wait. Finally, let’s add the metrics-server addon.

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/metrics-server/v1.16.x.yaml

After a few minutes:

$ kubectl top nodes
NAME                                             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
ip-172-20-36-96.eu-central-1.compute.internal    23m          0%     934Mi           5%        
ip-172-20-41-96.eu-central-1.compute.internal    41m          1%     979Mi           6%        
ip-172-20-45-182.eu-central-1.compute.internal   86m          4%     1629Mi          21%       
ip-172-20-47-69.eu-central-1.compute.internal    44m          1%     1008Mi          6%        
ip-172-20-54-204.eu-central-1.compute.internal   114m         5%     1589Mi          20%       
ip-172-20-58-197.eu-central-1.compute.internal   92m          4%     1597Mi          21%       
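
You can also inspect per-pod usage, for example in the kube-system namespace:

$ kubectl top pods -n kube-system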

This is a pretty basic guide so I remember what I did in case I need it again. However, I always suggest going through the official documentation. Do not forget to remove the cluster if you are just testing:

$ kops delete cluster --name $NAME --yes