OpenShift Data Foundation: Complete Storage Platform Guide
OpenShift Data Foundation (ODF), formerly OpenShift Container Storage (OCS), is Red Hat’s software-defined storage solution for OpenShift. Built on Ceph, Rook, and NooBaa, ODF provides unified block, file, and object storage with enterprise features like multi-cloud data management, disaster recovery, and seamless integration with OpenShift.
What is OpenShift Data Foundation?
ODF is a cloud-native persistent storage platform that runs on OpenShift, providing:
Key Features
- Unified Storage: Block (RBD), File (CephFS), Object (S3/Swift)
- Cloud-Native: Kubernetes-native via Rook operator
- Multi-Cloud Object Gateway: NooBaa for hybrid/multi-cloud
- Disaster Recovery: Metro DR, regional DR, backup/restore
- Data Services: Encryption, compression, deduplication
- Enterprise Support: Full Red Hat support and lifecycle management
- Self-Service: Dynamic provisioning with StorageClasses
- Performance Tiers: SSD, HDD, and NVMe support
- Monitoring: Integrated with OpenShift observability
- Security: Encryption at rest and in transit
Architecture
```text
┌─────────────────────────────────────────────────────────────┐
│                 OpenShift Data Foundation                   │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                  Storage Consumers                    │  │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐  │  │
│  │  │   PVC   │  │   PVC   │  │   S3    │  │   MCG   │  │  │
│  │  │ (Block) │  │ (File)  │  │ Bucket  │  │ Bucket  │  │  │
│  │  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘  │  │
│  └───────┼────────────┼────────────┼────────────┼───────┘  │
│          │            │            │            │          │
│  ┌───────┴────────────┴────────────┴────────────┴───────┐  │
│  │              Storage Abstraction Layer                │  │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────┐      │  │
│  │  │  RBD CSI   │  │ CephFS CSI │  │  RGW (S3)  │      │  │
│  │  └────────────┘  └────────────┘  └────────────┘      │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                  Rook-Ceph Operator                   │  │
│  │  • Orchestrates Ceph cluster                          │  │
│  │  • Manages lifecycle and upgrades                     │  │
│  │  • Monitors health and performance                    │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                 Ceph Storage Cluster                  │  │
│  │  ┌─────┐ ┌─────┐ ┌─────┐                              │  │
│  │  │ MON │ │ MON │ │ MON │                              │  │
│  │  └─────┘ └─────┘ └─────┘                              │  │
│  │  ┌─────┐ ┌─────┐                                      │  │
│  │  │ MGR │ │ MGR │                                      │  │
│  │  └─────┘ └─────┘                                      │  │
│  │  ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐           │  │
│  │  │OSD │ │OSD │ │OSD │ │OSD │ │OSD │ │OSD │           │  │
│  │  └────┘ └────┘ └────┘ └────┘ └────┘ └────┘           │  │
│  │  (Local Storage on Worker Nodes)                      │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │              NooBaa Multi-Cloud Gateway               │  │
│  │  • S3-compatible object storage                       │  │
│  │  • Hybrid/multi-cloud data management                 │  │
│  │  • Data lifecycle policies                            │  │
│  │  • Deduplication and compression                      │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
Prerequisites
Hardware Requirements
Minimum (Testing):
- 3 worker nodes
- 16 vCPU per node
- 64 GB RAM per node
- 2 TB raw storage per node (SSD recommended)
Production:
- 3+ dedicated storage nodes (or labeled workers)
- 16+ vCPU per node
- 128+ GB RAM per node
- 4 TB+ NVMe/SSD per node
- 10 Gbps network
- Separate storage network recommended
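With Ceph's default 3-way replication, usable capacity is roughly one third of raw capacity, and you should plan against a fill target rather than 100%. A rough sizing sketch (the node count and disk size below are illustrative, not prescriptive):

```python
def usable_capacity_tib(nodes: int, raw_tib_per_node: float,
                        replication: int = 3, fill_target: float = 0.75) -> float:
    """Estimate usable ODF capacity: raw capacity divided by the
    replication factor, capped at a fill target to leave headroom."""
    raw = nodes * raw_tib_per_node
    return raw / replication * fill_target

# 3 production nodes with 4 TiB NVMe each:
# 12 TiB raw -> 4 TiB logical -> ~3 TiB at a 75% fill target
print(round(usable_capacity_tib(3, 4.0), 2))  # 3.0
```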
Software Requirements
- OpenShift 4.10+
- Local Storage Operator (LSO) for bare metal
- OpenShift Monitoring Stack
Storage Topology
Internal Mode (Most Common)
- Uses worker node local storage
- MON/MGR/OSD run on worker nodes
- Best for most deployments

External Mode
- Connects to an existing Ceph cluster
- OpenShift consumes the storage
- Best for large existing Ceph deployments

Compact Mode
- Runs ODF on control plane nodes
- 3-node clusters only
- Not recommended for production

Installation
Install ODF Operator
```bash
# Via OpenShift Console:
# Operators → OperatorHub → Search "OpenShift Data Foundation"
# Click Install → Follow wizard
```
```bash
# Via CLI:
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-storage
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
  - openshift-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.14
  name: odf-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
EOF
```
```bash
# Verify operator installation
oc get csv -n openshift-storage
oc get pods -n openshift-storage
```
Prepare Storage Devices (Bare Metal)
```bash
# Install the Local Storage Operator
cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-local-storage
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: local-operator-group
  namespace: openshift-local-storage
spec:
  targetNamespaces:
  - openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: stable
  name: local-storage-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
```
```bash
# Label storage nodes
oc label nodes worker-1 cluster.ocs.openshift.io/openshift-storage=''
oc label nodes worker-2 cluster.ocs.openshift.io/openshift-storage=''
oc label nodes worker-3 cluster.ocs.openshift.io/openshift-storage=''
```
```bash
# Create LocalVolumeDiscovery
cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cluster.ocs.openshift.io/openshift-storage
        operator: Exists
EOF
```
```bash
# Wait and check discovered devices
oc get localvolumediscoveryresults -n openshift-local-storage
```
```bash
# Create LocalVolumeSet
cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeSet
metadata:
  name: local-block
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cluster.ocs.openshift.io/openshift-storage
        operator: Exists
  storageClassName: localblock
  volumeMode: Block
  fsType: ext4
  maxDeviceCount: 10
  deviceInclusionSpec:
    deviceTypes:
    - disk
    - part
    deviceMechanicalProperties:
    - NonRotational  # SSD/NVMe only
    minSize: 100Gi
EOF
```
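The deviceInclusionSpec above admits only non-rotational disks or partitions of at least 100 GiB. A minimal sketch of that selection logic (the device records are hypothetical, not LSO's actual data model):

```python
GIB = 1024 ** 3

def matches_inclusion_spec(device: dict) -> bool:
    """Mimic the LocalVolumeSet deviceInclusionSpec above:
    disks or partitions, non-rotational, >= 100 GiB."""
    return (device["type"] in {"disk", "part"}
            and not device["rotational"]
            and device["size_bytes"] >= 100 * GIB)

devices = [
    {"name": "nvme0n1", "type": "disk", "rotational": False, "size_bytes": 2048 * GIB},
    {"name": "sda", "type": "disk", "rotational": True, "size_bytes": 4096 * GIB},       # HDD: excluded
    {"name": "nvme1n1p1", "type": "part", "rotational": False, "size_bytes": 50 * GIB},  # too small
]
print([d["name"] for d in devices if matches_inclusion_spec(d)])  # ['nvme0n1']
```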
```bash
# Verify PVs created
oc get pv | grep localblock
```
Create Storage Cluster (Internal Mode)
```bash
# Via UI: Operators → Installed Operators → OpenShift Data Foundation
# → Create StorageSystem
```
```yaml
# Via YAML (storage-cluster.yaml):
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  # Encryption
  encryption:
    enable: true
    kms:
      enable: false  # Use Vault for production
  # Storage device sets
  storageDeviceSets:
  - name: ocs-deviceset-localblock
    count: 3  # One per storage node
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Ti
        storageClassName: localblock
        volumeMode: Block
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: cluster.ocs.openshift.io/openshift-storage
              operator: Exists
      tolerations:
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
    portable: true
    replica: 3
    resources:
      requests:
        cpu: "2"
        memory: 5Gi
      limits:
        cpu: "4"
        memory: 10Gi
  # Resource requirements
  resources:
    mon:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "2"
        memory: 4Gi
    mgr:
      requests:
        cpu: "1"
        memory: 3Gi
      limits:
        cpu: "2"
        memory: 6Gi
    mds:
      requests:
        cpu: "3"
        memory: 8Gi
      limits:
        cpu: "6"
        memory: 16Gi
    rgw:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        cpu: "2"
        memory: 8Gi
    noobaa-core:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        cpu: "2"
        memory: 8Gi
    noobaa-db:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        cpu: "2"
        memory: 8Gi
  # Monitoring
  monDataDirHostPath: /var/lib/rook
  monPVCTemplate:
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      storageClassName: localblock
      volumeMode: Filesystem
  # Multi-Cloud Gateway
  multiCloudGateway:
    reconcileStrategy: manage
    dbStorageClassName: ocs-storagecluster-ceph-rbd
    endpoints:
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"
          memory: 4Gi
  # Network
  network:
    provider: multus
    selectors:
      public: openshift-storage/ocs-public
      cluster: openshift-storage/ocs-cluster
  # Version
  version: 4.14.0
```

```bash
# Apply storage cluster
oc apply -f storage-cluster.yaml
```
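With count: 3, replica: 3, and 2Ti PVCs, this device set creates nine OSDs (count × replica) for 18 TiB raw, or about 6 TiB logical after 3-way replication. A quick sanity check of that arithmetic:

```python
def deviceset_capacity(count: int, replica: int, pvc_tib: float,
                       replication: int = 3) -> tuple[int, float, float]:
    """OSD count, raw TiB, and logical TiB implied by a storageDeviceSet."""
    osds = count * replica
    raw = osds * pvc_tib
    return osds, raw, raw / replication

osds, raw, logical = deviceset_capacity(count=3, replica=3, pvc_tib=2.0)
print(osds, raw, logical)  # 9 18.0 6.0
```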
```bash
# Monitor deployment
watch oc get pods -n openshift-storage
```
```bash
# Verify cluster health (wait 10-15 minutes)
oc get storagecluster -n openshift-storage
oc get cephcluster -n openshift-storage
oc rsh -n openshift-storage $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name) ceph -s
```
Storage Classes
Default Storage Classes
```bash
# List storage classes
oc get sc
```
```bash
# Block storage (RBD)
ocs-storagecluster-ceph-rbd        # Standard RBD
ocs-storagecluster-ceph-rbd-thick  # Thick provisioned

# File storage (CephFS)
ocs-storagecluster-cephfs          # Shared filesystem

# Object storage (NooBaa/MCG)
openshift-storage.noobaa.io        # NooBaa S3 bucket class

# Set default
oc patch storageclass ocs-storagecluster-ceph-rbd \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
Custom Storage Classes
```yaml
# High-performance RBD with custom pool
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ocs-rbd-ssd
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: openshift-storage
  pool: ssd-pool
  imageFeatures: layering,exclusive-lock,object-map,fast-diff
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
- discard
---
# CephFS with quota
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ocs-cephfs-quota
provisioner: openshift-storage.cephfs.csi.ceph.com
parameters:
  clusterID: openshift-storage
  fsName: ocs-storagecluster-cephfilesystem
  pool: ocs-storagecluster-cephfilesystem-data0
  mounter: kernel
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
allowVolumeExpansion: true
reclaimPolicy: Delete
```
Using ODF Storage
Block Storage (RBD)
```yaml
# PVC for a database
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-data
  namespace: production
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: ocs-storagecluster-ceph-rbd
---
# StatefulSet using RBD
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
  namespace: production
spec:
  serviceName: postgresql
  replicas: 3
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
      - name: postgresql
        image: postgres:15
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
      storageClassName: ocs-storagecluster-ceph-rbd
```
File Storage (CephFS)
```yaml
# Shared RWX storage for applications
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
  namespace: production
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 500Gi
  storageClassName: ocs-storagecluster-cephfs
---
# Deployment using the shared volume
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        volumeMounts:
        - name: shared-content
          mountPath: /usr/share/nginx/html
      volumes:
      - name: shared-content
        persistentVolumeClaim:
          claimName: shared-files
```
Object Storage (S3)
```yaml
# ObjectBucketClaim
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: app-bucket
  namespace: production
spec:
  generateBucketName: app-bucket
  storageClassName: openshift-storage.noobaa.io
---
# Application consuming the generated Secret and ConfigMap
apiVersion: apps/v1
kind: Deployment
metadata:
  name: s3-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: s3-app
  template:
    metadata:
      labels:
        app: s3-app
    spec:
      containers:
      - name: app
        image: myapp:latest
        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: app-bucket
              key: AWS_ACCESS_KEY_ID
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: app-bucket
              key: AWS_SECRET_ACCESS_KEY
        - name: BUCKET_NAME
          valueFrom:
            configMapKeyRef:
              name: app-bucket
              key: BUCKET_NAME
        - name: BUCKET_HOST
          valueFrom:
            configMapKeyRef:
              name: app-bucket
              key: BUCKET_HOST
```
Performance Tuning
Ceph Configuration
```bash
# Access the Ceph toolbox pod
oc rsh -n openshift-storage \
  $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
```
```bash
# Inside the tools pod:

# Increase PG count for better data distribution
ceph osd pool set ocs-storagecluster-cephblockpool pg_num 256
ceph osd pool set ocs-storagecluster-cephblockpool pgp_num 256
```
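A common rule of thumb for pg_num is (OSDs × 100) / pool replica count, rounded to a power of two; for a 9-OSD, 3-replica pool that comes out near 256. A sketch of that heuristic (Ceph's PG autoscaler does this more carefully in practice):

```python
def suggested_pg_num(osds: int, replicas: int, per_osd_target: int = 100) -> int:
    """Rule-of-thumb PG count: (osds * per_osd_target) / replicas,
    rounded to the nearest power of two."""
    raw = osds * per_osd_target / replicas
    lower = 1 << (int(raw).bit_length() - 1)  # largest power of two <= raw
    upper = lower * 2
    return lower if raw - lower <= upper - raw else upper

print(suggested_pg_num(osds=9, replicas=3))   # 256
print(suggested_pg_num(osds=30, replicas=3))  # 1024
```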
```bash
# Enable fast read for RBD
ceph osd pool set ocs-storagecluster-cephblockpool fast_read 1

# BlueStore cache tuning
ceph config set osd bluestore_cache_size_ssd 8589934592  # 8 GiB

# OSD recovery tuning
ceph config set osd osd_recovery_max_active 3
ceph config set osd osd_max_backfills 1

# Client-side RBD cache
ceph config set client rbd_cache true
ceph config set client rbd_cache_size 67108864  # 64 MiB
```
Network Configuration
```yaml
# Multus NetworkAttachmentDefinitions for a dedicated storage network
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ocs-public
  namespace: openshift-storage
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth1",
    "mode": "bridge",
    "ipam": {
      "type": "whereabouts",
      "range": "10.0.1.0/24"
    }
  }'
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ocs-cluster
  namespace: openshift-storage
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth2",
    "mode": "bridge",
    "ipam": {
      "type": "whereabouts",
      "range": "10.0.2.0/24"
    }
  }'
```
Disaster Recovery
Regional DR with RBD Mirroring
```bash
# Enable RBD mirroring on the primary cluster
oc rsh -n openshift-storage $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
```
```bash
# In the tools pod:
rbd mirror pool enable ocs-storagecluster-cephblockpool image
rbd mirror pool info ocs-storagecluster-cephblockpool

# Create bootstrap token
rbd mirror pool peer bootstrap create ocs-storagecluster-cephblockpool > /tmp/bootstrap-token

# On the secondary cluster, import the bootstrap token
rbd mirror pool peer bootstrap import ocs-storagecluster-cephblockpool /tmp/bootstrap-token

# Enable mirroring on specific images
rbd mirror image enable ocs-storagecluster-cephblockpool/pvc-xxxxx snapshot

# Check mirroring status
rbd mirror image status ocs-storagecluster-cephblockpool/pvc-xxxxx
```
Metro DR
```yaml
# DRPolicy for Metro DR
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPolicy
metadata:
  name: metro-dr
spec:
  drClusters:
  - primary-cluster
  - secondary-cluster
  schedulingInterval: 5m
  replicationClassSelector: {}
---
# DRPlacementControl
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPlacementControl
metadata:
  name: app-drpc
  namespace: production
spec:
  drPolicyRef:
    name: metro-dr
  placementRef:
    kind: Placement
    name: app-placement
  preferredCluster: primary-cluster
  pvcSelector:
    matchLabels:
      app: critical-app
```
Backup with OADP
```yaml
# Install the OADP Operator, then configure:
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: oadp-dpa
  namespace: openshift-adp
spec:
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
    restic:
      enable: true
  backupLocations:
  - velero:
      provider: aws
      default: true
      objectStorage:
        bucket: odf-backups
        prefix: velero
      config:
        region: us-east-1
        s3ForcePathStyle: "true"
        s3Url: https://s3.openshift-storage.svc
      credential:
        name: cloud-credentials
        key: cloud
---
# Backup schedule
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-backup
  namespace: openshift-adp
spec:
  schedule: "0 2 * * *"  # 02:00 daily
  template:
    includedNamespaces:
    - production
    snapshotVolumes: true
    ttl: 720h0m0s  # 30 days
```
Monitoring and Troubleshooting
Access Ceph Dashboard
```bash
# Get dashboard route
oc get route -n openshift-storage rook-ceph-mgr-dashboard
```
```bash
# Get admin password
oc get secret -n openshift-storage rook-ceph-dashboard-password \
  -o jsonpath='{.data.password}' | base64 -d
```
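If you are scripting against the API instead of piping through `base64 -d`, the decode is a one-liner; the encoded value below is a made-up example, not a real dashboard password:

```python
import base64

# jsonpath '{.data.password}' returns the base64-encoded Secret value
encoded = "c3VwZXJzZWNyZXQ="  # example only
password = base64.b64decode(encoded).decode("utf-8")
print(password)  # supersecret
```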
```bash
# Access at: https://rook-ceph-mgr-dashboard-openshift-storage.apps.cluster.example.com
# Username: admin
```
Health Checks
```bash
# Check ODF health
oc get storagecluster -n openshift-storage
oc get cephcluster -n openshift-storage
```
```bash
# Ceph status (inside the tools pod)
oc rsh -n openshift-storage $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
ceph -s
ceph health detail
```
```bash
# Check OSDs
ceph osd status
ceph osd tree
ceph osd df
```
```bash
# Pool status
ceph df
ceph osd pool stats
```
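`ceph df --format json` makes the same numbers scriptable. A sketch against a trimmed, illustrative sample of its `stats` section (the byte counts are invented), flagging the 75% fill threshold recommended under Best Practices:

```python
import json

# Trimmed, illustrative sample of `ceph df --format json` output
sample = json.loads("""
{"stats": {"total_bytes": 19791209299968,
           "total_used_bytes": 5937362789990,
           "total_avail_bytes": 13853846509978}}
""")

stats = sample["stats"]
used_pct = 100 * stats["total_used_bytes"] / stats["total_bytes"]
print(f"{used_pct:.1f}% used, over 75% threshold: {used_pct > 75}")
```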
```bash
# PG status
ceph pg stat
```
Common Issues
```bash
# OSD not starting
oc logs -n openshift-storage -l app=rook-ceph-osd

# MON quorum issues
oc logs -n openshift-storage -l app=rook-ceph-mon

# Slow requests (inside the tools pod)
ceph daemon osd.0 dump_historic_ops

# Restart ODF components
oc delete pod -n openshift-storage -l app=rook-ceph-mon
oc delete pod -n openshift-storage -l app=rook-ceph-mgr
```
Best Practices
Planning
- Right-size storage nodes: 128+ GB RAM, 16+ cores
- Use NVMe/SSD: HDD not recommended for production
- Separate networks: Dedicated storage network for better performance
- Plan for growth: size the cluster so initial data consumes only 30-40% of capacity, leaving headroom
Operations
- Monitor capacity: Keep below 75% full
- Regular scrubbing: Schedule during low-usage times
- Test DR procedures: Regular failover testing
- Backup critical data: Use OADP or external backups
- Update regularly: Follow Red Hat update schedule
Security
- Enable encryption: At rest and in transit
- Use KMS: Vault or external KMS for production
- Network policies: Restrict access to storage network
- RBAC: Least privilege access
- Audit logs: Enable and monitor
Conclusion
OpenShift Data Foundation provides enterprise-grade software-defined storage natively integrated with OpenShift. Built on proven Ceph technology with Red Hat support, ODF delivers the performance, scalability, and reliability required for production Kubernetes workloads.
Master OpenShift Data Foundation and cloud-native storage with our training programs. Contact us for enterprise OpenShift training.