Frequently Asked Questions

Deployment failed at install prometheus-operator: what should I do?

On slow networks or overloaded systems, prometheus-operator installation can time out, causing the deployment to fail. If this happens, you must follow these steps before restarting the playbook.

  1. Open a make shell environment with KUBECONFIG set.
  2. Delete and purge the prometheus-operator Helm release:
       helm delete --purge prometheus-operator
  3. Delete the prometheus-operator-create-sm Job:
       kubectl --namespace=kube-ops delete job prometheus-operator-create-sm
     If this command fails with Error from server (NotFound), this is OK.
  4. Delete the prometheus-operator-get-crd Job:
       kubectl --namespace=kube-ops delete job prometheus-operator-get-crd
     If this command fails with Error from server (NotFound), this is OK.

Re-run the playbook to finalize the deployment.
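
The manual cleanup above can also be scripted and run before the playbook is restarted. The following is a minimal sketch, not part of MetalK8s itself. It assumes helm (version 2, since --purge is used) and kubectl are on PATH and KUBECONFIG is set, as they are inside the make shell environment. The DRY_RUN switch and the function names are hypothetical, introduced only for this example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: clean up a failed prometheus-operator install.
# DRY_RUN is a hypothetical switch for this example:
#   DRY_RUN=1 (default) prints each command, DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cleanup_prometheus_operator() {
    # Delete and purge the Helm release (Helm v2 syntax, as in the steps above).
    run helm delete --purge prometheus-operator

    # Delete both helper Jobs. --ignore-not-found turns a missing Job into
    # a no-op instead of an "Error from server (NotFound)" failure.
    for job in prometheus-operator-create-sm prometheus-operator-get-crd; do
        run kubectl --namespace=kube-ops delete job "$job" --ignore-not-found
    done
}

cleanup_prometheus_operator
```

Run it once with the default DRY_RUN=1 to review the commands, then again with DRY_RUN=0 to execute them.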

How can I keep a logfile of Ansible executions?

There are two ways to configure Ansible to keep a logfile:

  • Set log_path in the defaults section of ansible.cfg. Relative paths are relative to the location of ansible.cfg.
  • Export ANSIBLE_LOG_PATH in the environment from which ansible-playbook will be invoked.

For more information, see DEFAULT_LOG_PATH in the Ansible configuration documentation.
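
As an illustration of the environment-variable approach, the sketch below exports ANSIBLE_LOG_PATH with a timestamped filename so each shell session logs to its own file. The LOG_DIR variable and its default location are assumptions made for this example. Any writable directory works.

```shell
# Sketch: send Ansible output to a per-session, timestamped logfile.
# LOG_DIR is a hypothetical name; its default location is an assumption.
LOG_DIR="${LOG_DIR:-${TMPDIR:-/tmp}/ansible-logs}"
mkdir -p "$LOG_DIR"

ANSIBLE_LOG_PATH="$LOG_DIR/ansible-$(date +%Y%m%d-%H%M%S).log"
export ANSIBLE_LOG_PATH

echo "Ansible will log to $ANSIBLE_LOG_PATH"
# Invoke ansible-playbook from this same shell so it inherits the variable.
```

The export only affects the current shell and its child processes. For a permanent setting, log_path in ansible.cfg is the better fit.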

Can I use the ‘root’ user to deploy MetalK8s to servers?

During the deployment of MetalK8s, a set of tasks is executed to bring the target system in line with the RHEL7 STIG security guidelines, using the ansible-hardening role. STIG rule V-72247 does not permit remote SSH access as the root user. If MetalK8s were deployed using root to access a remote system, applying this rule would therefore cut off access to that server.

To prevent this, the playbook asserts that ansible_user is not set to root on any of the target hosts and aborts the deployment if this configuration is detected.

To disable this security measure, set the security_sshd_permit_root_login variable to true on the relevant hosts or groups.