NAME
Rex::Rancher - Rancher Kubernetes (RKE2/K3s) deployment automation for Rex
VERSION
version 0.001
SYNOPSIS
    use Rex -feature => ['1.4'];
    use Rex::Rancher;

    # Deploy RKE2 control plane (no GPU)
    task "deploy_server", sub {
        rancher_deploy_server(
            distribution    => 'rke2',
            hostname        => 'cp-01',
            domain          => 'k8s.example.com',
            token           => 'my-secret',
            tls_san         => 'k8s.example.com',
            kubeconfig_file => "$ENV{HOME}/.kube/mycluster.yaml",
        );
    };

    # Deploy RKE2 control plane with GPU support
    task "deploy_gpu_server", sub {
        rancher_deploy_server(
            distribution    => 'rke2',
            gpu             => 1,  # requires Rex::GPU installed
            reboot          => 1,  # reboot after driver install (first deploy)
            hostname        => 'gpu-cp-01',
            domain          => 'k8s.example.com',
            token           => 'my-secret',
            tls_san         => 'gpu-cp-01.k8s.example.com',
            kubeconfig_file => "$ENV{HOME}/.kube/gpu-cluster.yaml",
        );
    };

    # Deploy K3s worker with GPU support
    task "deploy_gpu_worker", sub {
        rancher_deploy_agent(
            distribution => 'k3s',
            gpu          => 1,  # requires Rex::GPU installed
            hostname     => 'gpu-01',
            domain       => 'k8s.example.com',
            server       => 'https://10.0.0.1:6443',
            token        => 'K10...',
        );
    };

    # Deploy a single-node cluster (control plane + workloads on same node)
    task "deploy_single_node", sub {
        rancher_deploy_server(
            distribution    => 'rke2',
            token           => 'my-secret',
            tls_san         => '10.0.0.1',
            kubeconfig_file => "$ENV{HOME}/.kube/single.yaml",
        );

        # Remove control-plane taint so workloads can be scheduled
        untaint_node(kubeconfig => "$ENV{HOME}/.kube/single.yaml");
    };
DESCRIPTION
Rex::Rancher provides complete, zero-touch Kubernetes cluster deployment for Rancher distributions (RKE2 and K3s) using the Rex orchestration framework. It handles everything from raw Linux node preparation through to a running CNI and GPU device plugin.
GPU support is optional. Pass gpu => 1 and install Rex::GPU separately. Rex::Rancher works identically for non-GPU nodes.
When deploying a GPU server node, the full pipeline runs automatically:
1. Node preparation — hostname, timezone, locale, NTP, swap off, kernel modules (br_netfilter, overlay), sysctl for Kubernetes networking.

2. GPU setup (gpu => 1) — NVIDIA driver via DKMS, optional reboot, Container Toolkit, CDI specs, containerd runtime config. Handled by Rex::GPU.

3. Cluster bring-up — write config, run the RKE2 or K3s install script, wait for the kubeconfig file on the remote host, fetch and save it locally, wait for API server readiness via Kubernetes::REST.

4. Cilium CNI — Cilium CLI installed on the remote host, Cilium deployed with distribution-appropriate Helm values.

5. NVIDIA device plugin (gpu => 1 plus kubeconfig_file) — DaemonSet applied via the Kubernetes API, wait for nvidia.com/gpu capacity on the node. No kubectl required anywhere.
All Kubernetes API operations (steps 3 and 5) run locally on the machine executing Rex using Kubernetes::REST and IO::K8s. No kubectl binary is needed on the remote host.
This distribution supports hosts without an SFTP subsystem (common on Hetzner dedicated servers). For such hosts, use set connection => "LibSSH" and install Rex::LibSSH.
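A minimal Rexfile sketch for such an SFTP-less host, using only calls shown in this document; the user, group name, and host address are placeholders:

```perl
use Rex -feature => ['1.4'];
use Rex::Rancher;

# Hosts without an SFTP subsystem need the LibSSH connection backend
# (provided by the separate Rex::LibSSH distribution).
set connection => "LibSSH";

user "root";
group rancher => "10.0.0.1";    # placeholder address

task "deploy", group => "rancher", sub {
    rancher_deploy_server(
        distribution    => 'rke2',
        tls_san         => '10.0.0.1',
        kubeconfig_file => "$ENV{HOME}/.kube/mycluster.yaml",
    );
};
```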
For fine-grained control, use the individual modules directly:
- Rex::Rancher::Node — Node preparation
- Rex::Rancher::Server — Control plane installation and config retrieval
- Rex::Rancher::Agent — Worker node installation
- Rex::Rancher::Cilium — Cilium CNI installation and upgrade
- Rex::Rancher::K8s — Kubernetes API operations (device plugin, readiness, untaint)
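As a hedged sketch of fine-grained use, the pipeline can be composed step by step from these modules instead of calling rancher_deploy_server. The function names below (prepare_node, install_server, install_cilium) appear in the pipeline description in this document, but their exact signatures are assumptions:

```perl
use Rex -feature => ['1.4'];
use Rex::Rancher::Node;      # prepare_node
use Rex::Rancher::Server;    # install_server
use Rex::Rancher::Cilium;    # install_cilium

# Run only the steps you need; option names are assumed to match
# the corresponding rancher_deploy_server options.
task "deploy_stepwise", sub {
    prepare_node(hostname => 'cp-01', timezone => 'UTC');
    install_server(distribution => 'rke2', token => 'my-secret');
    install_cilium();
};
```

Consult the per-module documentation for the authoritative signatures.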
rancher_deploy_server(%opts)
Full control plane deployment in a single call: prepare the node, optionally set up GPU support, install the Kubernetes distribution, wait for the API, install Cilium CNI, and deploy the NVIDIA device plugin.
When gpu => 1 is passed and Rex::GPU is installed, GPU detection and driver installation are performed automatically as step 2 before the cluster is brought up. After Cilium is running, the NVIDIA device plugin DaemonSet is deployed via the local Kubernetes API (no kubectl required on the remote host) and the function waits for nvidia.com/gpu resources to appear on the node.
The full pipeline for a GPU server deployment:
1. prepare_node — hostname, timezone, swap off, kernel modules, sysctl

2. gpu_setup (only with gpu => 1) — driver + toolkit + CDI + containerd config

3. install_server — write config, run installer, wait for kubeconfig file

4. Fetch kubeconfig locally, patch 127.0.0.1 to the real server address, save to kubeconfig_file, wait for API with "wait_for_api" in Rex::Rancher::K8s

5. install_cilium — install Cilium CLI on remote, apply via cilium install

6. deploy_nvidia_device_plugin (only with gpu => 1 and kubeconfig_file)
Options:

distribution
    Kubernetes distribution to install. rke2 (default) or k3s.

gpu
    If true, detect GPUs and run the full GPU setup pipeline via Rex::GPU before installing the Kubernetes distribution. Requires Rex::GPU to be installed. Default: 0.

reboot
    If true, reboot the host after GPU driver installation and wait for it to come back before proceeding. Only meaningful with gpu => 1. Required on first deploy when nouveau was previously loaded. Default: 0.

hostname
    Short hostname to set on the node (optional). If omitted, the existing hostname is left unchanged.

domain
    Domain suffix for the FQDN (optional). Used together with hostname to set /etc/hosts. If hostname is given without domain, the hostname is still set but no hosts entry is written.

timezone
    Timezone string, e.g. Europe/Berlin. Default: UTC.

token
    Shared cluster secret used for node joining. Auto-generated if omitted.

tls_san
    Additional TLS Subject Alternative Names for the API server certificate. Accepts a string (single SAN or comma-separated list) or an arrayref. The first SAN is used as the server address when patching the kubeconfig (see kubeconfig_file below).

kubeconfig_file
    Local file path where the cluster kubeconfig is saved after the server is running. Required for the NVIDIA device plugin step to work. Optional — if omitted, no local kubeconfig is saved and device plugin deployment is skipped even when gpu => 1.

    RKE2 and K3s write https://127.0.0.1 into the kubeconfig. The first tls_san entry (or kubeconfig_server if provided) is substituted for 127.0.0.1 so the saved file connects to the real server address.

kubeconfig_server
    Explicit server address to use when patching the kubeconfig. Overrides the tls_san-based default.

node_labels
    Node labels to apply, as an arrayref of key=value strings.

registries
    Private registry mirror configuration hashref, written to registries.yaml. See "install_server" in Rex::Rancher::Server for the structure.

cilium
    Whether to configure Cilium CNI. Default: 1. Set to 0 to keep the distribution's built-in CNI (Canal for RKE2, Flannel for K3s).
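A sketch combining several of the options above — disabling Cilium to keep RKE2's built-in Canal CNI and labeling the node. All values are placeholders:

```perl
use Rex -feature => ['1.4'];
use Rex::Rancher;

task "deploy_labeled", sub {
    rancher_deploy_server(
        distribution    => 'rke2',
        cilium          => 0,    # keep Canal, RKE2's built-in CNI
        node_labels     => [ 'tier=control', 'env=prod' ],
        tls_san         => 'k8s.example.com',
        kubeconfig_file => "$ENV{HOME}/.kube/prod.yaml",
    );
};
```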
rancher_deploy_agent(%opts)
Full worker node deployment: prepare the node, optionally set up GPU support, install the Kubernetes agent, and join the existing cluster.
The pipeline is shorter than "rancher_deploy_server" — there is no Cilium installation or kubeconfig retrieval. GPU support via gpu => 1 works identically to the server case.
Options: same as "rancher_deploy_server" plus:
server
    URL of the server to join. For RKE2: https://SERVER_IP:9345. For K3s: https://SERVER_IP:6443. Required.

token
    Node join token. Obtain from the server with "get_token" in Rex::Rancher::Server. Required.

node_name
    Override the node name registered in Kubernetes (optional).
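A sketch of a worker joining an existing RKE2 cluster. The server address, token, and node name are placeholders; note RKE2 joins on port 9345 while K3s uses 6443:

```perl
use Rex -feature => ['1.4'];
use Rex::Rancher;

task "join_worker", sub {
    # Obtain the real join token beforehand, e.g. via get_token()
    # from Rex::Rancher::Server run against the control plane node.
    rancher_deploy_agent(
        distribution => 'rke2',
        server       => 'https://10.0.0.1:9345',  # RKE2 join port
        token        => 'K10...',                 # placeholder token
        node_name    => 'worker-01',
    );
};
```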
SEE ALSO
Rex, Rex::LibSSH, Rex::GPU, Rex::Rancher::K8s, Kubernetes::REST, IO::K8s
SUPPORT
Issues
Please report bugs and feature requests on GitHub at https://github.com/Getty/rex-rancher/issues.
CONTRIBUTING
Contributions are welcome! Please fork the repository and submit a pull request.
AUTHOR
Torsten Raudssus <getty@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2026 by Torsten Raudssus <torsten@raudssus.de> https://raudssus.de/.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.