Cluster expansion¶
Once the Bootstrap node has been installed
(see Deployment of the Bootstrap node), the cluster can be expanded.
Unlike the kubeadm join approach, which relies on bootstrap tokens and
manual operations on each node, MetalK8s uses Salt SSH to set up new
Nodes through declarative configuration, from a single entrypoint.
This operation can be done either through the MetalK8s GUI or
the command-line.
Defining an Architecture¶
Follow the recommendations provided in the introduction to choose an architecture.
List the machines to deploy and their associated roles, and deploy each of them using the following process, either from the GUI or the CLI. Note, however, that the finest control over roles and taints can only be achieved using the command-line.
Adding a Node with the MetalK8s GUI¶
To reach the UI, refer to this procedure.
Creating a Node Object¶
The first step to adding a Node to a cluster is to declare it in the API. The MetalK8s GUI provides a simple form for that purpose.
Navigate to the Node list page, by clicking the button in the sidebar:
From the Node list (the Bootstrap node should be visible there), click the button labeled “Create a New Node”:
Fill the form with relevant information (make sure the SSH provisioning for the Bootstrap node is done first):
Name: the hostname of the new Node
SSH User: the user with which the Bootstrap node has SSH access
Hostname or IP: the address to use for SSH from the Bootstrap node
SSH Port: the port to use for SSH from the Bootstrap node
SSH Key Path: the path to the private key generated in this procedure
Sudo required: whether the SSH deployment will need sudo access
Roles/Workload Plane: enable workload applications to run on this Node
Roles/Control Plane: enable master and etcd services to run on this Node
Roles/Infra: enable infra services to run on this Node
Note
A combination of multiple roles is possible: selecting both the Workload Plane and Infra checkboxes will result in both infra services and workload applications running on this Node.
Click Create. You will be redirected to the Node list page, and will be shown a notification to confirm the Node creation:
Deploying the Node¶
After the desired state has been declared, it can be applied to the machine. The MetalK8s GUI uses SaltAPI to orchestrate the deployment.
From the Node list page, click the Deploy button for any Node that has not yet been deployed.
Once clicked, the button changes to Deploying. Click it again to open the deployment status page:
Detailed events are shown on the right of this page, for advanced users to debug in case of errors.
Todo
UI should parse these events further
Events should be documented
When deployment is complete, click Back to nodes list. The new Node should be in a Ready state.
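The Node's state can also be verified from the Bootstrap node's command line (a quick check, using the same admin kubeconfig as in the command-line section below):

root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes <node-name>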
Todo
troubleshooting (example errors)
Adding a Node from the Command-line¶
Creating a Manifest¶
Adding a node requires the creation of a manifest file, following the template below:
apiVersion: v1
kind: Node
metadata:
  name: <node_name>
  annotations:
    metalk8s.scality.com/ssh-key-path: /etc/metalk8s/pki/salt-bootstrap
    metalk8s.scality.com/ssh-host: <node control plane IP>
    metalk8s.scality.com/ssh-sudo: 'false'
    metalk8s.scality.com/ssh-user: 'root'
  labels:
    metalk8s.scality.com/version: '127.0.4-dev'
    <role labels>
spec:
  taints: <taints>
Annotations are used by Salt SSH to connect to the node and deploy it.
All annotations are prefixed with metalk8s.scality.com/:

| Annotation | Description | Default |
|---|---|---|
| ssh-host | Control plane IP of the node, must be accessible over SSH from the Bootstrap node | None |
| ssh-key-path | Path to the private SSH key used to connect to the node | None |
| ssh-sudo | Whether to use sudo to run commands on the node | 'false' |
| ssh-user | User to connect to the node and run commands as | 'root' |
The combination of <role labels> and <taints> will determine what is
installed and deployed on the Node.
Roles determine a Node's responsibilities; taints are complementary to roles.

A node exclusively in the control plane with etcd storage: roles and taints are both set to master and etcd. This has the same behavior as the Control Plane checkbox in the GUI.
[…]
metadata:
[…]
labels:
node-role.kubernetes.io/master: ''
node-role.kubernetes.io/etcd: ''
[… (other labels except roles)]
spec:
[…]
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
- effect: NoSchedule
key: node-role.kubernetes.io/etcd
A worker node dedicated to infra services (see Introduction): roles and taints are both set to infra. This has the same behavior as the Infra checkbox in the GUI.
[…]
metadata:
[…]
labels:
node-role.kubernetes.io/infra: ''
[… (other labels except roles)]
spec:
[…]
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/infra
A simple worker still accepting infra services would use the same role labels without the taints: roles are set to node and infra. This matches selecting both the Workload Plane and Infra checkboxes in the MetalK8s GUI, as shown in the sketch below.
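Following the pattern of the manifests above, such a node would carry the role labels but no corresponding taints (a sketch; the node role label is assumed to follow the node-role.kubernetes.io/ convention used in the other examples):

[…]
metadata:
  […]
  labels:
    node-role.kubernetes.io/node: ''
    node-role.kubernetes.io/infra: ''
    [… (other labels except roles)]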
CLI-only actions¶
A Node dedicated to etcd: roles and taints are both set to etcd.
[…]
metadata:
[…]
labels:
node-role.kubernetes.io/etcd: ''
[… (other labels except roles)]
spec:
[…]
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/etcd
Creating the Node Object¶
Use kubectl to send the manifest file created before to the Kubernetes API.
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f <path-to-node-manifest>
node/<node-name> created
Check that it is available in the API and has the expected roles.
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
NAME          STATUS    ROLES                         AGE   VERSION
bootstrap     Ready     bootstrap,etcd,infra,master   12d   v1.11.7
<node-name>   Unknown   <expected node roles>         29s
Deploying the Node¶
Open a terminal in the Salt Master container using this procedure.
Check that SSH access from the Salt Master to the new node is properly configured (see SSH Provisioning).
Note
Salt SSH requires Python 3 to be installed on the remote host to run Salt functions. It will be installed automatically when deploying the node, though you can send raw shell commands beforehand (using --raw-shell) if needed.

root@salt-master-bootstrap $ salt-ssh --roster=kubernetes <node_name> --raw-shell 'echo OK'
<node_name>:
    ----------
    retcode:
        0
    stderr:
        Warning: Permanently added '<ip>' (ECDSA) to the list of known hosts.
    stdout:
        OK
Start the node deployment.
root@salt-master-bootstrap $ salt-run state.orchestrate metalk8s.orchestrate.deploy_node \
    saltenv=metalk8s-127.0.4-dev \
    pillar='{"orchestrate": {"node_name": "<node-name>"}}'

... lots of output ...

Summary for bootstrap_master
------------
Succeeded: 7 (changed=7)
Failed:    0
------------
Total states run:     7
Total run time: 121.468 s
Todo
Troubleshooting section
explain orchestrate output and how to find errors
point to log files
Checking Cluster Health¶
During the expansion, it is recommended to check the cluster state after each node addition.
When expanding the control plane, check the etcd cluster health:
root@bootstrap $ kubectl -n kube-system exec -ti etcd-bootstrap sh --kubeconfig /etc/kubernetes/admin.conf
root@etcd-bootstrap $ etcdctl --endpoints=https://[127.0.0.1]:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
endpoint health --cluster
https://<first-node-ip>:2379 is healthy: successfully committed proposal: took = 16.285672ms
https://<second-node-ip>:2379 is healthy: successfully committed proposal: took = 43.462092ms
https://<third-node-ip>:2379 is healthy: successfully committed proposal: took = 52.879358ms
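As a complementary sanity check after each expansion step, verify that all Nodes are Ready and that the system Pods are running (a basic check; the exact Pod list depends on the roles deployed on each node):

root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
root@bootstrap $ kubectl --kubeconfig /etc/kubernetes/admin.conf get pods -n kube-system -o wide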
Todo
add sanity checks for Pods lists (also in the relevant sections in services)