Vladimir Chavkov

Rancher: Complete Kubernetes Management Platform Guide


Rancher is an open-source container management platform that simplifies deploying and managing Kubernetes clusters across any infrastructure. This comprehensive guide covers Rancher installation, cluster management, and production deployment strategies.

What is Rancher?

Rancher provides a complete platform for managing Kubernetes:

Key Features

  1. Multi-Cluster Management: Manage hundreds of clusters from a single pane of glass
  2. Cluster Provisioning: Deploy Kubernetes on any infrastructure
  3. Application Catalog: Deploy apps from Helm charts
  4. User Management: Centralized authentication and RBAC
  5. Monitoring: Built-in Prometheus and Grafana
  6. Logging: Centralized log aggregation
  7. CI/CD: Integration with GitOps tools
  8. Multi-Tenancy: Project-based isolation
  9. Backup/Restore: Cluster backup and disaster recovery
  10. RKE/RKE2/K3s: Rancher’s own Kubernetes distributions

Architecture

┌────────────────────────────────────────────────────────┐
│               Rancher Management Server                │
│                                                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │      UI      │  │     API      │  │     Auth     │  │
│  │ (Dashboard)  │  │    Server    │  │ (LDAP/SAML)  │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
│                                                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │                Cluster Controller                │  │
│  │  • Provisions clusters                           │  │
│  │  • Manages applications                          │  │
│  │  • Syncs cluster state                           │  │
│  └──────────────────────────────────────────────────┘  │
└──────────────────────────┬─────────────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
    ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
    │  Cluster 1  │ │  Cluster 2  │ │  Cluster 3  │
    │    (EKS)    │ │    (GKE)    │ │   (RKE2)    │
    │             │ │             │ │             │
    │   Rancher   │ │   Rancher   │ │   Rancher   │
    │    Agent    │ │    Agent    │ │    Agent    │
    └─────────────┘ └─────────────┘ └─────────────┘

Installation

Prerequisites

  - A Kubernetes cluster to host Rancher (RKE2, K3s, or a managed distribution)
  - kubectl configured against that cluster
  - Helm 3
  - A DNS record (e.g. rancher.example.com) resolving to your load balancer or ingress

Install with Helm

Terminal window
# Add Rancher Helm repository
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm repo update

# Create namespace
kubectl create namespace cattle-system

# Install cert-manager (if not already installed)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.3/cert-manager.yaml

# Wait for cert-manager
kubectl wait --for=condition=Ready pods --all -n cert-manager --timeout=300s

# Install Rancher
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com \
  --set bootstrapPassword=admin \
  --set ingress.tls.source=letsEncrypt \
  --set letsEncrypt.email=admin@example.com \
  --set letsEncrypt.ingress.class=nginx

# Check rollout status
kubectl -n cattle-system rollout status deploy/rancher
kubectl -n cattle-system get pods

# Get Rancher URL
echo https://rancher.example.com

Using Own Certificates

Terminal window
# Create TLS secret
kubectl -n cattle-system create secret tls tls-rancher-ingress \
  --cert=tls.crt \
  --key=tls.key

# Install Rancher with custom certs
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com \
  --set ingress.tls.source=secret \
  --set privateCA=true

Airgap Installation

Terminal window
# Pull Rancher images
rancher-save-images.sh --image-list rancher-images.txt

# Push to private registry
rancher-load-images.sh --image-list rancher-images.txt \
  --registry registry.example.com

# Install from private registry
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com \
  --set rancherImage=registry.example.com/rancher/rancher \
  --set systemDefaultRegistry=registry.example.com \
  --set useBundledSystemChart=true

Cluster Provisioning

Import Existing Cluster

Terminal window
# From Rancher UI:
# 1. Click "Import Existing"
# 2. Enter cluster name
# 3. Copy and run kubectl command on target cluster
# Example command generated by Rancher:
curl --insecure -sfL https://rancher.example.com/v3/import/xxxxx.yaml | kubectl apply -f -

Create RKE2 Cluster

# RKE2 cluster configuration
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: production-cluster
  namespace: fleet-default
spec:
  kubernetesVersion: v1.28.5+rke2r1
  rkeConfig:
    machineGlobalConfig:
      cni: calico
      disable-kube-proxy: false
      etcd-expose-metrics: false
    machinePools:
      - name: controlplane
        quantity: 3
        etcdRole: true
        controlPlaneRole: true
        workerRole: false
        machineConfigRef:
          kind: VmwarevsphereConfig
          name: vsphere-controlplane
      - name: worker
        quantity: 5
        workerRole: true
        machineConfigRef:
          kind: VmwarevsphereConfig
          name: vsphere-worker
    registries:
      configs:
        registry.example.com:
          authConfigSecretName: registry-creds
    upgradeStrategy:
      controlPlaneConcurrency: "1"
      workerConcurrency: "2"
      controlPlaneDrainOptions:
        timeout: 600
        deleteEmptyDirData: true
      workerDrainOptions:
        timeout: 600
        deleteEmptyDirData: true
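The machineConfigRef entries above point at machine config objects that describe the node template for each pool. For vSphere these live in the rke-machine-config.cattle.io group; a minimal sketch (values are illustrative placeholders, not a complete config) might look like:

```yaml
# Sketch of a vSphere machine config referenced by the worker pool above
apiVersion: rke-machine-config.cattle.io/v1
kind: VmwarevsphereConfig
metadata:
  name: vsphere-worker
  namespace: fleet-default
cpuCount: "4"        # vCPUs per node
memorySize: "8192"   # MB of RAM per node
diskSize: "40000"    # MB of disk per node
datacenter: /DC1
datastore: /DC1/datastore/ds1
```

Credentials for vCenter itself come from the cloud credential attached to the cluster, not from this object.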

Provision on Cloud Providers

AWS (EKS)

apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: eks-cluster
  namespace: fleet-default
spec:
  cloudCredentialSecretName: aws-credentials
  eksConfig:
    region: us-east-1
    kubernetesVersion: "1.28"
    nodeGroups:
      - nodegroupName: ng-general
        desiredSize: 3
        maxSize: 10
        minSize: 3
        instanceType: t3.large
        diskSize: 100
        labels:
          workload-type: general
      - nodegroupName: ng-spot
        desiredSize: 5
        maxSize: 20
        minSize: 2
        instanceType: t3.large
        capacityType: SPOT
        labels:
          workload-type: spot
    publicAccess: true
    privateAccess: true
    loggingTypes:
      - api
      - audit
      - authenticator
      - controllerManager
      - scheduler

Azure (AKS)

apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: aks-cluster
  namespace: fleet-default
spec:
  cloudCredentialSecretName: azure-credentials
  aksConfig:
    resourceGroup: rancher-rg
    resourceLocation: eastus
    kubernetesVersion: "1.28.5"
    nodePools:
      - name: system
        count: 3
        vmSize: Standard_D4s_v5
        mode: System
        osType: Linux
        osDiskSizeGB: 128
        maxPods: 50
      - name: user
        count: 5
        vmSize: Standard_D4s_v5
        mode: User
        enableAutoScaling: true
        minCount: 3
        maxCount: 20
        maxPods: 50
    networkPlugin: azure
    networkPolicy: azure
    loadBalancerSku: standard
    monitoring: true

User and Access Management

Authentication Providers

Active Directory/LDAP

# Configure in Rancher UI: Security → Authentication → ActiveDirectory
# Or via API
apiVersion: management.cattle.io/v3
kind: ActiveDirectoryConfig
metadata:
  name: activedirectory
enabled: true
servers:
  - ldap.example.com
port: 389
tls: true
connectionTimeout: 5000
userSearchBase: dc=example,dc=com
userObjectClass: person
userNameAttribute: sAMAccountName
userSearchAttribute: sAMAccountName
groupSearchBase: dc=example,dc=com
groupObjectClass: group
groupNameAttribute: name
groupSearchAttribute: member

SAML (Okta, Azure AD)

apiVersion: management.cattle.io/v3
kind: SamlConfig
metadata:
  name: saml
enabled: true
idpMetadataContent: |
  <EntityDescriptor ...>
  ...
  </EntityDescriptor>
spCert: |
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
spKey: |
  -----BEGIN PRIVATE KEY-----
  ...
  -----END PRIVATE KEY-----

RBAC

Global Roles

# Custom global role
apiVersion: management.cattle.io/v3
kind: GlobalRole
metadata:
  name: cluster-provisioner
displayName: Cluster Provisioner
rules:
  - apiGroups:
      - management.cattle.io
    resources:
      - clusters
    verbs:
      - create
      - delete
      - get
      - list
      - update
---
# Assign to user
apiVersion: management.cattle.io/v3
kind: GlobalRoleBinding
metadata:
  name: john-cluster-provisioner
globalRoleName: cluster-provisioner
userPrincipalName: local://u-xxxxx

Cluster Roles

# Assign cluster role
apiVersion: management.cattle.io/v3
kind: ClusterRoleTemplateBinding
metadata:
  name: john-cluster-owner
  namespace: c-xxxxx
clusterName: c-xxxxx
roleTemplateName: cluster-owner
userPrincipalName: local://u-xxxxx

Project Roles

# Assign project role
apiVersion: management.cattle.io/v3
kind: ProjectRoleTemplateBinding
metadata:
  name: john-project-member
  namespace: c-xxxxx
projectName: c-xxxxx:p-xxxxx
roleTemplateName: project-member
userPrincipalName: local://u-xxxxx

Projects and Namespaces

Create Project

apiVersion: management.cattle.io/v3
kind: Project
metadata:
  name: production
  namespace: c-xxxxx
spec:
  clusterName: c-xxxxx
  displayName: Production
  description: Production workloads
  resourceQuota:
    limit:
      limitsCpu: "10000m"
      limitsMemory: "20Gi"
      requestsCpu: "5000m"
      requestsMemory: "10Gi"
      persistentVolumeClaims: "10"
      services: "10"
  namespaceDefaultResourceQuota:
    limit:
      limitsCpu: "1000m"
      limitsMemory: "2Gi"
      requestsCpu: "500m"
      requestsMemory: "1Gi"
  containerDefaultResourceLimit:
    limitsCpu: "500m"
    limitsMemory: "512Mi"
    requestsCpu: "250m"
    requestsMemory: "256Mi"

Network Isolation

# Project network policy
apiVersion: management.cattle.io/v3
kind: ProjectNetworkPolicy
metadata:
  name: production-isolation
  namespace: c-xxxxx
spec:
  projectName: c-xxxxx:p-xxxxx
  description: Isolate production project
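Under the hood, project isolation is enforced with ordinary Kubernetes NetworkPolicies in each project namespace. A hand-written equivalent, assuming Rancher's field.cattle.io/projectId namespace label (namespace and project ID below are placeholders), would look roughly like:

```yaml
# Sketch: accept ingress only from namespaces in the same project
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: project-isolation
  namespace: my-app
spec:
  podSelector: {}        # applies to all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              field.cattle.io/projectId: p-xxxxx
```

This requires a CNI that enforces NetworkPolicy (Calico, Cilium, etc.), which is also true of Rancher's built-in project isolation.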

Application Deployment

App Catalog

Terminal window
# Add Helm chart repository
# UI: Apps → Repositories → Create
# Or via kubectl
kubectl apply -f - <<EOF
apiVersion: catalog.cattle.io/v1
kind: ClusterRepo
metadata:
  name: bitnami
spec:
  url: https://charts.bitnami.com/bitnami
EOF

Deploy Application

# Deploy from catalog
apiVersion: catalog.cattle.io/v1
kind: App
metadata:
  name: postgresql
  namespace: production
spec:
  chart:
    metadata:
      name: postgresql
      version: 12.x.x
    spec:
      sourceRepo: bitnami
  values: |
    auth:
      postgresPassword: secretpassword
      database: myapp
    primary:
      persistence:
        enabled: true
        size: 100Gi
    metrics:
      enabled: true

Monitoring

Enable Monitoring

# Enable cluster monitoring
apiVersion: management.cattle.io/v3
kind: MonitoringConfig
metadata:
  name: cluster-monitoring
  namespace: c-xxxxx
spec:
  prometheus:
    retention: 12h
    persistence:
      enabled: true
      storageClass: default
      size: 50Gi
    resources:
      limits:
        cpu: 1000m
        memory: 2Gi
      requests:
        cpu: 500m
        memory: 1Gi
  grafana:
    persistence:
      enabled: true
      storageClass: default
      size: 10Gi
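Because Rancher monitoring is built on the Prometheus Operator, application metrics are scraped with the usual operator resources. A ServiceMonitor sketch for a service exposing /metrics (names, labels, and port are illustrative):

```yaml
# Sketch: scrape a workload's metrics endpoint every 30s
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: production
spec:
  selector:
    matchLabels:
      app: my-app       # must match the Service's labels
  endpoints:
    - port: metrics     # named port on the Service
      path: /metrics
      interval: 30s
```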

Custom Prometheus Rules

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-alerts
  namespace: cattle-monitoring-system
spec:
  groups:
    - name: custom
      rules:
        - alert: HighPodCPU
          expr: |
            sum(rate(container_cpu_usage_seconds_total[5m])) by (pod, namespace) > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High CPU usage on {{ $labels.pod }}"

Logging

Enable Logging

# Configure cluster logging
apiVersion: management.cattle.io/v3
kind: ClusterLogging
metadata:
  name: cluster-logging
  namespace: c-xxxxx
spec:
  clusterName: c-xxxxx
  elasticsearchConfig:
    endpoint: https://elasticsearch.example.com:9200
    indexPrefix: rancher
    authPassword: password
    authUsername: elastic
    certificate: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----

FluentBit Configuration

apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: app-logs
  namespace: production
spec:
  filters:
    - parser:
        parse:
          type: json
  match:
    - select:
        labels:
          app: my-app
  outputRefs:
    - elasticsearch
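The outputRefs list must point at an Output object in the same namespace. A sketch of a matching Elasticsearch Output (endpoint and secret names are placeholders):

```yaml
# Sketch: Elasticsearch Output referenced by the Flow above
apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: elasticsearch
  namespace: production
spec:
  elasticsearch:
    host: elasticsearch.example.com
    port: 9200
    scheme: https
    user: elastic
    password:
      valueFrom:
        secretKeyRef:
          name: es-credentials
          key: password
```

Cluster-wide destinations use ClusterOutput/ClusterFlow in the logging namespace instead.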

Backup and Disaster Recovery

Backup Configuration

apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: daily-backup
  namespace: fleet-default
spec:
  resourceSetName: rancher-resource-set
  schedule: "0 2 * * *"
  retentionCount: 30
  storageLocation:
    s3:
      credentialSecretName: s3-creds
      credentialSecretNamespace: default
      bucketName: rancher-backups
      region: us-east-1
      folder: production
      endpoint: s3.amazonaws.com
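The credentialSecretName above refers to an opaque Secret holding the S3 keys; the rancher-backup operator expects accessKey and secretKey entries. A sketch with placeholder values:

```yaml
# Sketch: S3 credentials consumed by the Backup above
apiVersion: v1
kind: Secret
metadata:
  name: s3-creds
  namespace: default
type: Opaque
stringData:
  accessKey: "<aws-access-key-id>"
  secretKey: "<aws-secret-access-key>"
```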

Restore from Backup

apiVersion: resources.cattle.io/v1
kind: Restore
metadata:
  name: restore-from-backup
  namespace: fleet-default
spec:
  backupFilename: daily-backup-20260211020000.tar.gz
  storageLocation:
    s3:
      credentialSecretName: s3-creds
      credentialSecretNamespace: default
      bucketName: rancher-backups
      region: us-east-1
      folder: production

Continuous Delivery

Fleet GitOps

# GitRepo for Fleet
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: fleet-apps
  namespace: fleet-local
spec:
  repo: https://github.com/example/fleet-apps
  branch: main
  paths:
    - ./apps
  targets:
    - name: production
      clusterSelector:
        matchLabels:
          env: production
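Each path in the repository can carry a fleet.yaml that tells Fleet how to deploy the manifests or chart found there. A minimal sketch (chart path and values are illustrative):

```yaml
# ./apps/fleet.yaml — sketch of a per-path Fleet config
defaultNamespace: my-app
helm:
  chart: ./chart        # chart stored alongside this file
  values:
    replicas: 3
targetCustomizations:
  - name: production
    clusterSelector:
      matchLabels:
        env: production
    helm:
      values:
        replicas: 5     # override for production clusters
```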

See the separate Rancher Fleet blog post for a deeper look at Fleet.

High Availability

HA Rancher Setup

Terminal window
# Install Rancher with 3 replicas
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com \
  --set replicas=3 \
  --set resources.requests.cpu=1000m \
  --set resources.requests.memory=2Gi \
  --set resources.limits.cpu=2000m \
  --set resources.limits.memory=4Gi

Database Backup

Rancher stores its state in the Kubernetes datastore (etcd) of the cluster it runs on, so back up that cluster's etcd on its control-plane nodes rather than exec'ing into the Rancher pods.

Terminal window
# On an RKE2 server (control-plane/etcd) node
rke2 etcd-snapshot save --name rancher-snapshot
# List available snapshots
rke2 etcd-snapshot list

Best Practices

Security

  1. Use RBAC: Define fine-grained access controls
  2. Enforce Pod Security Standards: Apply Pod Security Admission (Pod Security Policies were removed in Kubernetes 1.25)
  3. Network Policies: Isolate workloads
  4. TLS Everywhere: Use certificates for all communications
  5. Regular Updates: Keep Rancher and clusters patched
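Since Pod Security Policies were removed in Kubernetes 1.25, per-namespace security standards are typically enforced with Pod Security Admission labels; a sketch (namespace name is illustrative):

```yaml
# Sketch: enforce the "restricted" Pod Security Standard in a namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production-apps
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```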

Performance

# Rancher server optimization
resources:
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 1000m
    memory: 2Gi

# Agent resource limits
cattle-cluster-agent:
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 250m
      memory: 256Mi

Multi-Tenancy

  1. Projects: Group namespaces logically
  2. Resource Quotas: Prevent resource exhaustion
  3. Network Isolation: Separate traffic between projects
  4. RBAC: Assign appropriate permissions

Troubleshooting

Terminal window
# Check Rancher pods
kubectl -n cattle-system get pods
kubectl -n cattle-system logs -l app=rancher

# Check agent connection
kubectl -n cattle-system get pods -l app=cattle-cluster-agent

# View cluster events
kubectl get events -A --sort-by='.lastTimestamp'

# Debug cluster connection
curl -k https://rancher.example.com/v3/clusters

# Reset admin password
kubectl -n cattle-system exec -it rancher-xxx -- reset-password

Conclusion

Rancher provides a comprehensive platform for managing Kubernetes at scale. Its multi-cluster capabilities, user-friendly interface, and extensive features make it an excellent choice for organizations managing multiple Kubernetes clusters across different infrastructures.


Master Kubernetes management with Rancher through our training programs. Contact us for customized training.

