Oracle Container Engine for Kubernetes (OKE): Production Deployment Guide
Oracle Container Engine for Kubernetes (OKE) is Oracle Cloud’s managed Kubernetes service that provides enterprise-grade container orchestration with deep OCI integration. This comprehensive guide covers OKE architecture, deployment patterns, and production best practices.
Why Choose OKE?
Key Advantages
- Free Control Plane: No charges for Kubernetes control plane
- Bare Metal Workers: Option for bare metal nodes (unique among cloud providers)
- High Performance: RDMA networking for ultra-low latency
- OCI Integration: Native integration with OCI services
- Virtual Nodes: Serverless Kubernetes nodes
- Security: Pod Security Policies, Network Security Groups, encryption by default
OKE vs. Other Managed Kubernetes
| Feature | OKE | EKS | AKS | GKE |
|---|---|---|---|---|
| Control Plane Cost | Free | $0.10/hr | Free (standard tier paid) | $0.10/hr (one zonal cluster free) |
| Bare Metal Nodes | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Virtual Nodes | ✅ Yes | Fargate | ACI | Autopilot |
| RDMA Networking | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Kubernetes Version | Latest | Latest | Latest | Latest |
Architecture
OKE Cluster Components
```
OCI Region
│
├── OKE Control Plane (Oracle managed, free)
│     • API Server (multi-AD)
│     • etcd (highly available)
│     • Controller Manager
│     • Scheduler
│
└── VCN (Virtual Cloud Network)
      ├── Worker Node Pool 1 (VM.Standard.E4.Flex)
      │     Node 1 (AD-1), Node 2 (AD-2), Node 3 (AD-3), each running Pods
      ├── Worker Node Pool 2 (BM.GPU4.8, bare metal with GPUs)
      └── Virtual Node Pool (serverless: automatic scaling, no node management)
```

Creating an OKE Cluster
Using OCI CLI
```bash
# Create a VCN for OKE
oci network vcn create \
  --compartment-id $COMPARTMENT_ID \
  --display-name oke-vcn \
  --cidr-block 10.0.0.0/16 \
  --dns-label okevcn
```
```bash
# Create OKE cluster
oci ce cluster create \
  --compartment-id $COMPARTMENT_ID \
  --name production-cluster \
  --vcn-id $VCN_ID \
  --kubernetes-version v1.28.2 \
  --service-lb-subnet-ids '["'$LB_SUBNET_ID'"]' \
  --endpoint-subnet-id $API_ENDPOINT_SUBNET_ID \
  --endpoint-public-ip-enabled true \
  --pods-cidr 10.244.0.0/16 \
  --services-cidr 10.96.0.0/16 \
  --cluster-pod-network-options '[{"cniType": "OCI_VCN_IP_NATIVE"}]'
```
```bash
# Create node pool
oci ce node-pool create \
  --cluster-id $CLUSTER_ID \
  --compartment-id $COMPARTMENT_ID \
  --name general-pool \
  --node-shape VM.Standard.E4.Flex \
  --node-shape-config '{"ocpus": 2, "memoryInGBs": 16}' \
  --node-image-id $IMAGE_ID \
  --size 3 \
  --placement-configs '[{
    "availabilityDomain": "IYVq:US-ASHBURN-AD-1",
    "subnetId": "'$WORKER_SUBNET_ID'",
    "faultDomains": ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]
  }]' \
  --node-config-details '{
    "size": 3,
    "placementConfigs": [...],
    "nsgIds": ["'$WORKER_NSG_ID'"],
    "isPvEncryptionInTransitEnabled": true
  }'
```

Using Terraform
```hcl
terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
      version = "~> 5.0"
    }
  }
}

provider "oci" {
  region           = var.region
  tenancy_ocid     = var.tenancy_ocid
  user_ocid        = var.user_ocid
  fingerprint      = var.fingerprint
  private_key_path = var.private_key_path
}

# Availability domain and node image lookups, referenced by the node pools below
data "oci_identity_availability_domain" "ad1" {
  compartment_id = var.tenancy_ocid
  ad_number      = 1
}

data "oci_core_images" "oke_images" {
  compartment_id   = var.compartment_id
  operating_system = "Oracle Linux"
  shape            = "VM.Standard.E4.Flex"
  sort_by          = "TIMECREATED"
  sort_order       = "DESC"
}

# VCN for OKE
resource "oci_core_vcn" "oke_vcn" {
  compartment_id = var.compartment_id
  display_name   = "oke-vcn"
  cidr_blocks    = ["10.0.0.0/16"]
  dns_label      = "okevcn"
}

# Internet Gateway
resource "oci_core_internet_gateway" "oke_ig" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.oke_vcn.id
  display_name   = "oke-internet-gateway"
}

# NAT Gateway
resource "oci_core_nat_gateway" "oke_nat" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.oke_vcn.id
  display_name   = "oke-nat-gateway"
}

# Service Gateway
data "oci_core_services" "all_services" {
  filter {
    name   = "name"
    values = ["All .* Services In Oracle Services Network"]
    regex  = true
  }
}

resource "oci_core_service_gateway" "oke_sg" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.oke_vcn.id
  display_name   = "oke-service-gateway"

  services {
    service_id = data.oci_core_services.all_services.services[0].id
  }
}
```
```hcl
# Subnets
resource "oci_core_subnet" "oke_api_endpoint_subnet" {
  compartment_id             = var.compartment_id
  vcn_id                     = oci_core_vcn.oke_vcn.id
  cidr_block                 = "10.0.0.0/28"
  display_name               = "oke-api-endpoint-subnet"
  dns_label                  = "apiendpoint"
  prohibit_public_ip_on_vnic = false
  route_table_id             = oci_core_route_table.oke_api_rt.id
  security_list_ids          = [oci_core_security_list.oke_api_sl.id]
}

resource "oci_core_subnet" "oke_lb_subnet" {
  compartment_id             = var.compartment_id
  vcn_id                     = oci_core_vcn.oke_vcn.id
  cidr_block                 = "10.0.1.0/24"
  display_name               = "oke-lb-subnet"
  dns_label                  = "lbsubnet"
  prohibit_public_ip_on_vnic = false
  route_table_id             = oci_core_route_table.oke_public_rt.id
  security_list_ids          = [oci_core_security_list.oke_lb_sl.id]
}

resource "oci_core_subnet" "oke_worker_subnet" {
  compartment_id             = var.compartment_id
  vcn_id                     = oci_core_vcn.oke_vcn.id
  cidr_block                 = "10.0.10.0/24"
  display_name               = "oke-worker-subnet"
  dns_label                  = "workers"
  prohibit_public_ip_on_vnic = true
  route_table_id             = oci_core_route_table.oke_private_rt.id
  security_list_ids          = [oci_core_security_list.oke_worker_sl.id]
}
```
```hcl
# Route Tables
resource "oci_core_route_table" "oke_public_rt" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.oke_vcn.id
  display_name   = "oke-public-rt"

  route_rules {
    destination       = "0.0.0.0/0"
    network_entity_id = oci_core_internet_gateway.oke_ig.id
  }
}

resource "oci_core_route_table" "oke_private_rt" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.oke_vcn.id
  display_name   = "oke-private-rt"

  route_rules {
    destination       = "0.0.0.0/0"
    network_entity_id = oci_core_nat_gateway.oke_nat.id
  }

  route_rules {
    destination       = data.oci_core_services.all_services.services[0].cidr_block
    destination_type  = "SERVICE_CIDR_BLOCK"
    network_entity_id = oci_core_service_gateway.oke_sg.id
  }
}

resource "oci_core_route_table" "oke_api_rt" {
  compartment_id = var.compartment_id
  vcn_id         = oci_core_vcn.oke_vcn.id
  display_name   = "oke-api-rt"

  route_rules {
    destination       = "0.0.0.0/0"
    network_entity_id = oci_core_internet_gateway.oke_ig.id
  }
}
```
```hcl
# OKE Cluster
resource "oci_containerengine_cluster" "oke_cluster" {
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.28.2"
  name               = "production-cluster"
  vcn_id             = oci_core_vcn.oke_vcn.id

  endpoint_config {
    is_public_ip_enabled = true
    subnet_id            = oci_core_subnet.oke_api_endpoint_subnet.id
    nsg_ids              = [oci_core_network_security_group.oke_api_nsg.id]
  }

  options {
    service_lb_subnet_ids = [oci_core_subnet.oke_lb_subnet.id]

    add_ons {
      is_kubernetes_dashboard_enabled = false
      is_tiller_enabled               = false
    }

    admission_controller_options {
      # PodSecurityPolicy was removed in Kubernetes 1.25, so it cannot be
      # enabled on a v1.28 cluster; use Pod Security Admission instead.
      is_pod_security_policy_enabled = false
    }

    kubernetes_network_config {
      pods_cidr     = "10.244.0.0/16"
      services_cidr = "10.96.0.0/16"
    }
  }

  cluster_pod_network_options {
    cni_type = "OCI_VCN_IP_NATIVE"
  }

  freeform_tags = {
    "Environment" = "Production"
    "ManagedBy"   = "Terraform"
  }
}
```
```hcl
# Node Pool - General Purpose
resource "oci_containerengine_node_pool" "general_pool" {
  cluster_id         = oci_containerengine_cluster.oke_cluster.id
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.28.2"
  name               = "general-pool"
  node_shape         = "VM.Standard.E4.Flex"

  node_shape_config {
    ocpus         = 2
    memory_in_gbs = 16
  }

  node_config_details {
    size = 3

    placement_configs {
      availability_domain = data.oci_identity_availability_domain.ad1.name
      subnet_id           = oci_core_subnet.oke_worker_subnet.id
      fault_domains       = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]
    }

    nsg_ids                             = [oci_core_network_security_group.oke_worker_nsg.id]
    is_pv_encryption_in_transit_enabled = true
  }

  node_source_details {
    image_id    = data.oci_core_images.oke_images.images[0].id
    source_type = "IMAGE"
  }

  initial_node_labels {
    key   = "workload-type"
    value = "general"
  }

  ssh_public_key = file("~/.ssh/id_rsa.pub")
}
```
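Once the pool's nodes register, workloads can be steered onto them with a nodeSelector that matches the `workload-type` label defined above. A minimal sketch (the deployment name and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: general-workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: general-workload
  template:
    metadata:
      labels:
        app: general-workload
    spec:
      # Schedule only onto nodes carrying the general-pool label
      nodeSelector:
        workload-type: general
      containers:
        - name: app
          image: my-app:latest  # hypothetical image
```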
```hcl
# Node Pool - High Memory
resource "oci_containerengine_node_pool" "memory_pool" {
  cluster_id         = oci_containerengine_cluster.oke_cluster.id
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.28.2"
  name               = "memory-pool"
  node_shape         = "VM.Standard.E4.Flex"

  node_shape_config {
    ocpus         = 4
    memory_in_gbs = 64
  }

  node_config_details {
    size = 2

    placement_configs {
      availability_domain = data.oci_identity_availability_domain.ad1.name
      subnet_id           = oci_core_subnet.oke_worker_subnet.id
    }

    nsg_ids = [oci_core_network_security_group.oke_worker_nsg.id]
  }

  node_source_details {
    image_id    = data.oci_core_images.oke_images.images[0].id
    source_type = "IMAGE"
  }

  initial_node_labels {
    key   = "workload-type"
    value = "memory-intensive"
  }

  node_eviction_node_pool_settings {
    eviction_grace_duration              = "PT1H"
    is_force_delete_after_grace_duration = true
  }

  ssh_public_key = file("~/.ssh/id_rsa.pub")
}
```
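The one-hour eviction grace period above gives cordon-and-drain time to honor PodDisruptionBudgets before nodes are force-deleted. A minimal PDB sketch for a workload running on this pool (the names and label are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cache-pdb
spec:
  minAvailable: 1      # keep at least one replica up during node drain
  selector:
    matchLabels:
      app: cache       # illustrative app label
```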
```hcl
# Bare Metal Node Pool for High Performance
resource "oci_containerengine_node_pool" "bare_metal_pool" {
  cluster_id         = oci_containerengine_cluster.oke_cluster.id
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.28.2"
  name               = "bare-metal-pool"
  node_shape         = "BM.Standard.E4.128"

  node_config_details {
    size = 2

    placement_configs {
      availability_domain = data.oci_identity_availability_domain.ad1.name
      subnet_id           = oci_core_subnet.oke_worker_subnet.id
    }

    nsg_ids = [oci_core_network_security_group.oke_worker_nsg.id]
  }

  node_source_details {
    image_id    = data.oci_core_images.oke_images.images[0].id
    source_type = "IMAGE"
  }

  initial_node_labels {
    key   = "workload-type"
    value = "bare-metal"
  }

  node_metadata = {
    "user_data" = base64encode(file("${path.module}/cloud-init.yaml"))
  }

  ssh_public_key = file("~/.ssh/id_rsa.pub")
}
```
```hcl
# Virtual Node Pool (Serverless)
resource "oci_containerengine_virtual_node_pool" "virtual_pool" {
  cluster_id     = oci_containerengine_cluster.oke_cluster.id
  compartment_id = var.compartment_id
  display_name   = "virtual-pool"

  placement_configurations {
    availability_domain = data.oci_identity_availability_domain.ad1.name
    subnet_id           = oci_core_subnet.oke_worker_subnet.id
    fault_domain        = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]
  }

  pod_configuration {
    shape     = "Pod.Standard.E4.Flex"
    subnet_id = oci_core_subnet.oke_worker_subnet.id

    nsg_ids = [oci_core_network_security_group.oke_worker_nsg.id]
  }

  size = 0 # Auto-scales based on pod requests

  initial_virtual_node_labels {
    key   = "virtual-node"
    value = "true"
  }

  taints {
    key    = "virtual-node"
    value  = "true"
    effect = "NoSchedule"
  }
}
```

Get Kubeconfig
```bash
# Generate kubeconfig
oci ce cluster create-kubeconfig \
  --cluster-id $CLUSTER_ID \
  --file $HOME/.kube/config \
  --region us-ashburn-1 \
  --token-version 2.0.0 \
  --kube-endpoint PUBLIC_ENDPOINT
```
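With token version 2.0.0, the generated kubeconfig contains no static credentials; it uses an exec plugin that shells out to the OCI CLI for a short-lived token on every request. The `users` entry it produces looks roughly like this (abridged; the user name is illustrative and the cluster OCID placeholder is left as-is):

```yaml
users:
  - name: user-example
    user:
      exec:
        apiVersion: client.authentication.k8s.io/v1beta1
        command: oci
        args:
          - ce
          - cluster
          - generate-token
          - --cluster-id
          - <cluster-ocid>
          - --region
          - us-ashburn-1
```

This means every `kubectl` invocation needs a working OCI CLI configuration (or instance/resource principal) on the client machine.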
```bash
# Verify access
export KUBECONFIG=$HOME/.kube/config
kubectl get nodes
kubectl get pods --all-namespaces
```

OCI-Specific Integrations
1. OCI Cloud Controller Manager
Automatically manages load balancers and block volumes:
```yaml
# Service with OCI Load Balancer
apiVersion: v1
kind: Service
metadata:
  name: web-app
  annotations:
    service.beta.kubernetes.io/oci-load-balancer-shape: "flexible"
    service.beta.kubernetes.io/oci-load-balancer-shape-flex-min: "10"
    service.beta.kubernetes.io/oci-load-balancer-shape-flex-max: "100"
    service.beta.kubernetes.io/oci-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/oci-load-balancer-tls-secret: "web-app-tls"
    service.beta.kubernetes.io/oci-load-balancer-backend-protocol: "HTTP"
    service.beta.kubernetes.io/oci-load-balancer-health-check-retries: "3"
    service.beta.kubernetes.io/oci-load-balancer-health-check-interval: "10000"
spec:
  type: LoadBalancer
  selector:
    app: web-app
  ports:
    - name: https
      port: 443
      targetPort: 8080
    - name: http
      port: 80
      targetPort: 8080
```

2. OCI Block Volume CSI Driver
```yaml
# StorageClass for Block Volumes
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oci-bv-ultra-high-perf
provisioner: blockvolume.csi.oraclecloud.com
parameters:
  attachment-type: "paravirtualized"
  vpusPerGB: "30"  # Ultra High Performance starts at 30 VPUs/GB (20 is Higher Performance)
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: oci-bv-ultra-high-perf
  resources:
    requests:
      storage: 500Gi
---
# StatefulSet provisioning per-replica volumes from the same StorageClass
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: oci-bv-ultra-high-perf
        resources:
          requests:
            storage: 500Gi
```

3. OCI File Storage (FSS) CSI Driver
```yaml
# StorageClass for FSS
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oci-fss
provisioner: fss.csi.oraclecloud.com
parameters:
  availabilityDomain: "IYVq:US-ASHBURN-AD-1"
  mountTargetSubnetOcid: "ocid1.subnet.oc1.iad.unique_ID"
  exportPath: "/shared"
---
# PersistentVolumeClaim for shared storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-storage
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: oci-fss
  resources:
    requests:
      storage: 1Ti
---
# Deployment with shared storage
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: app
          image: my-app:latest
          volumeMounts:
            - name: shared-data
              mountPath: /app/shared
      volumes:
        - name: shared-data
          persistentVolumeClaim:
            claimName: shared-storage
```

4. Using Virtual Nodes
```yaml
# Deploy to virtual nodes for burst capacity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-job
spec:
  replicas: 50
  selector:
    matchLabels:
      app: batch-job
  template:
    metadata:
      labels:
        app: batch-job
    spec:
      nodeSelector:
        virtual-node: "true"
      tolerations:
        - key: virtual-node
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: job
          image: batch-processor:latest
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
            limits:
              cpu: "2"
              memory: 4Gi
```

Monitoring and Logging
OCI Logging Integration
```yaml
# Fluent Bit DaemonSet for OCI Logging
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: kube-system
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Daemon       off
        Log_Level    info

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5

    [OUTPUT]
        Name          oci
        Match         *
        Region        us-ashburn-1
        LogGroupOCID  ocid1.loggroup.oc1.iad.unique_ID
        LogOCID       ocid1.log.oc1.iad.unique_ID
        AuthType      instance_principal
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: config
          configMap:
            name: fluent-bit-config
```

Security Best Practices
Pod Security Policies
Note that PodSecurityPolicy was removed in Kubernetes v1.25 and is therefore unavailable on a v1.28 cluster; enforce the Pod Security Standards through the built-in Pod Security Admission controller instead (for example, by labeling namespaces with `pod-security.kubernetes.io/enforce: restricted`). For clusters still on v1.24 or earlier, the equivalent restricted PSP looks like this:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: false
```

Network Security Groups
The Terraform configuration above attaches NSGs to the API endpoint and worker nodes (`oke_api_nsg`, `oke_worker_nsg`); define their ingress and egress rules according to Oracle's documented OKE network requirements. With the VCN-native pod networking CNI, pods can also be assigned their own NSGs rather than only inheriting the worker node's rules.
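NSGs filter traffic at the VNIC level; for pod-to-pod segmentation inside the cluster, Kubernetes NetworkPolicies complement them. A minimal default-deny sketch for one namespace (the namespace name is illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production  # illustrative namespace
spec:
  podSelector: {}        # applies to every pod in the namespace
  policyTypes:
    - Ingress            # deny all ingress unless another policy allows it
```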
Production Checklist
- Multi-AD node pools configured
- Virtual node pool for burst capacity
- OCI Block Volume CSI driver installed
- OCI File Storage CSI driver installed (if needed)
- Load balancer annotations configured
- Monitoring and logging enabled
- Pod Security Standards enforced (PSP was removed in Kubernetes v1.25)
- Network Security Groups configured
- Resource quotas defined
- Backup strategy implemented
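The resource-quota item above can be sketched as a per-namespace ResourceQuota (the namespace and limits are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: production  # illustrative namespace
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
    persistentvolumeclaims: "20"
```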
Conclusion
Oracle Container Engine for Kubernetes provides a robust, high-performance managed Kubernetes service with unique features like bare metal nodes and free control plane. Deep OCI integration and competitive pricing make OKE an excellent choice for containerized workloads on Oracle Cloud.
Master Oracle Cloud and Kubernetes with our comprehensive cloud training programs. Contact us for customized OKE training tailored to your team.