Cluster expansion¶
Once the Bootstrap node has been installed
(see Deployment of the Bootstrap node), the cluster can be expanded.
Unlike the kubeadm join
approach, which relies on bootstrap tokens and
manual operations on each node, MetalK8s uses Salt SSH to set up new
Nodes through declarative configuration,
from a single entrypoint. This operation can be done through
the command-line.
Defining an architecture¶
See the schema defined in the introduction.
With the Bootstrap node already deployed, four more Nodes must be deployed: two control-plane Nodes (bringing the control plane to a total of three members) and two workload-plane Nodes.
Todo
explain architecture: 3 control-plane + etcd, 2 workers (one being dedicated for infra)
remind roles and taints from intro
Adding a node from the command-line¶
Creating a manifest¶
Adding a node requires the creation of a manifest file, following the template below:
apiVersion: v1
kind: Node
metadata:
  name: <node_name>
  annotations:
    metalk8s.scality.com/ssh-key-path: /etc/metalk8s/pki/salt-bootstrap
    metalk8s.scality.com/ssh-host: <node control-plane IP>
    metalk8s.scality.com/ssh-sudo: 'false'
  labels:
    metalk8s.scality.com/version: '2.2.0-dev'
    <role labels>
spec:
  taints: <taints>
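As an illustration, a filled-in manifest for a hypothetical node carrying the infra role could look as follows. The node name and IP address are placeholder values to adapt to your environment; the annotation keys, version label, and infra role/taint come from the template and role descriptions in this section.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: node-1                                    # hypothetical node name
  annotations:
    metalk8s.scality.com/ssh-key-path: /etc/metalk8s/pki/salt-bootstrap
    metalk8s.scality.com/ssh-host: 192.168.1.100  # hypothetical control-plane IP
    metalk8s.scality.com/ssh-sudo: 'false'
  labels:
    metalk8s.scality.com/version: '2.2.0-dev'
    node-role.kubernetes.io/infra: ''
spec:
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/infra
```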
The combination of <role labels>
and <taints>
will determine what is
installed and deployed on the Node.
A node exclusively in the control-plane with etcd
storage will have:
[…]
metadata:
  […]
  labels:
    node-role.kubernetes.io/master: ''
    node-role.kubernetes.io/etcd: ''
    [… (other labels except roles)]
spec:
  […]
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  - effect: NoSchedule
    key: node-role.kubernetes.io/etcd
A worker node dedicated to infra
services (see Introduction) will
use:
[…]
metadata:
  […]
  labels:
    node-role.kubernetes.io/infra: ''
    [… (other labels except roles)]
spec:
  […]
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/infra
A simple worker still accepting infra
services would use the same role
label without the taint.
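Such a simple worker's manifest would, as a sketch, keep the infra role label but omit the taints entirely (elided fields follow the template above):

```yaml
[…]
metadata:
  […]
  labels:
    node-role.kubernetes.io/infra: ''
    [… (other labels except roles)]
spec:
  […]
```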
Creating the Node object¶
Use kubectl
to send the manifest file created above to the Kubernetes API.
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f <path-to-node-manifest>
node/<node-name> created
Check that it is available in the API and has the expected roles.
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
NAME          STATUS    ROLES                         AGE   VERSION
bootstrap     Ready     bootstrap,etcd,infra,master   12d   v1.11.7
<node-name>   Unknown   <expected node roles>         29s
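To double-check the labels and taints actually applied to the new Node object, one can use standard kubectl invocations (a sketch, reusing the kubeconfig path from above; `--show-labels` and `describe` are regular kubectl features):

```
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get node <node-name> --show-labels
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf describe node <node-name> | grep -A 2 Taints
```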
Deploying the node¶
Open a terminal in the Salt Master container using this procedure.
Check that SSH access from the Salt Master to the new node is properly configured (see SSH provisioning).
root@salt-master-bootstrap $ salt-ssh --roster kubernetes <node-name> test.ping
<node-name>:
True
Start the node deployment.
root@salt-master-bootstrap $ salt-run state.orchestrate metalk8s.orchestrate.deploy_node \
    saltenv=metalk8s-2.2.0-dev \
    pillar='{"orchestrate": {"node_name": "<node-name>"}}'

... lots of output ...

Summary for bootstrap_master
------------
Succeeded: 7 (changed=7)
Failed:    0
------------
Total states run:     7
Total run time:   121.468 s
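After a successful run, the new node should eventually report a Ready status. A quick check, as a sketch using the same kubectl invocation as before:

```
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get node <node-name>
```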
Troubleshooting¶
Todo
explain orchestrate output and how to find errors
point to log files
Checking the cluster health¶
During the expansion, it is recommended to check the cluster state between each node addition.
When expanding the control-plane, one can check the etcd cluster health:
root@bootstrap $ kubectl -n kube-system exec -ti etcd-bootstrap sh --kubeconfig /etc/kubernetes/admin.conf
root@etcd-bootstrap $ etcdctl --endpoints=https://[127.0.0.1]:2379 \
--ca-file=/etc/kubernetes/pki/etcd/ca.crt \
--cert-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key-file=/etc/kubernetes/pki/etcd/healthcheck-client.key \
cluster-health
member 46af28ca4af6c465 is healthy: got healthy result from https://172.21.254.6:2379
member 81de403db853107e is healthy: got healthy result from https://172.21.254.7:2379
member 8878627efe0f46be is healthy: got healthy result from https://172.21.254.8:2379
cluster is healthy
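The same etcdctl flags can also list the cluster members, to confirm the newly deployed control-plane Node has joined etcd (a sketch; `member list` is a standard etcdctl v2 subcommand):

```
root@etcd-bootstrap $ etcdctl --endpoints=https://[127.0.0.1]:2379 \
         --ca-file=/etc/kubernetes/pki/etcd/ca.crt \
         --cert-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
         --key-file=/etc/kubernetes/pki/etcd/healthcheck-client.key \
         member list
```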
Todo
add sanity checks for Pods lists (also in the relevant sections in services)
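A basic sanity check in the meantime is to list the Pods in the kube-system namespace and verify they are all Running or Completed (a sketch, using the same kubectl options as above):

```
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get pods -n kube-system -o wide
```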