Longhorn Storage Class HA Migration Plan

Overview

This document describes the migration that brings all Longhorn storage classes to a high-availability (HA) configuration, with appropriate replica counts and volume binding modes.

Storage Class Analysis

HA-Ready Storage Classes ✅

  • longhorn (default): 2 replicas, WaitForFirstConsumer
  • longhorn-retain: 2 replicas, WaitForFirstConsumer
  • postgres-storage: 2 replicas, WaitForFirstConsumer
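
A quick read-only check of these settings on the live cluster (this assumes the classes expose the numberOfReplicas parameter as listed above):

kubectl --context fzymgc-house get storageclass \
  -o custom-columns='NAME:.metadata.name,REPLICAS:.parameters.numberOfReplicas,BINDING:.volumeBindingMode,RECLAIM:.reclaimPolicy'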

Storage Classes Requiring Updates

1. longhorn-encrypted ⚠️

  • Current: 2 replicas, Immediate binding
  • Issue: Immediate binding prevents optimal pod scheduling
  • Fix: Change to WaitForFirstConsumer
  • Impact: 3 Vault PVCs use this class

2. longhorn-1replica-encrypted ❌

  • Current: 1 replica, Immediate binding
  • Issue: Single replica = no HA, cannot survive node failure
  • Fix: Change to 2 replicas, WaitForFirstConsumer
  • Impact: No PVCs currently use this class

3. longhorn-static ❌

  • Current: No replica count set, Immediate binding
  • Issue: Missing critical HA parameters
  • Action: Delete (orphaned resource, not in Git, no PVCs)
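
Before making any changes, a read-only listing of which PVCs use each storage class helps confirm the impact notes in the three sections above (expect no rows for longhorn-1replica-encrypted or longhorn-static):

kubectl --context fzymgc-house get pvc -A \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,STORAGECLASS:.spec.storageClassName'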

Changes Applied

Terraform Updates

File: tf/cluster-bootstrap/longhorn.tf

longhorn-encrypted (lines 80-106):

- volume_binding_mode    = "Immediate"
+ volume_binding_mode    = "WaitForFirstConsumer"

longhorn-1replica-encrypted (lines 108-134):

- volume_binding_mode    = "Immediate"
+ volume_binding_mode    = "WaitForFirstConsumer"

  parameters = {
-   numberOfReplicas = "1"
+   numberOfReplicas = "2"
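
To preview just these two changes before the full apply, a targeted plan can be used; the resource addresses below are assumptions and should be adjusted to match the names used in longhorn.tf:

cd tf/cluster-bootstrap
terraform plan \
  -target='kubernetes_storage_class_v1.longhorn_encrypted' \
  -target='kubernetes_storage_class_v1.longhorn_1replica_encrypted'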

Migration Steps

Phase 1: Apply Terraform Changes (Non-Disruptive)

Because StorageClass parameters and volumeBindingMode are immutable, Terraform will replace (delete and recreate) the affected StorageClass objects. This does NOT affect existing PVCs/PVs; only new volumes will use the updated configuration.

cd tf/cluster-bootstrap
terraform plan -out=tfplan
terraform apply tfplan

Expected outcome:

  • longhorn-encrypted: volumeBindingMode updated to WaitForFirstConsumer
  • longhorn-1replica-encrypted: numberOfReplicas updated to "2", volumeBindingMode updated

Phase 2: Clean Up Orphaned Storage Class

kubectl --context fzymgc-house delete storageclass longhorn-static
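
As a safety check before deleting, confirm nothing still references the class (both commands should print the fallback message):

kubectl --context fzymgc-house get pvc -A \
  -o custom-columns='NAMESPACE:.metadata.namespace,STORAGECLASS:.spec.storageClassName' \
  | grep longhorn-static || echo "no PVCs reference longhorn-static"
kubectl --context fzymgc-house get pv \
  -o custom-columns='NAME:.metadata.name,STORAGECLASS:.spec.storageClassName' \
  | grep longhorn-static || echo "no PVs reference longhorn-static"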

Phase 3: Migrate Existing Vault PVCs (Optional)

The 3 Vault PVCs using longhorn-encrypted already have 2 replicas (correct). Only the binding mode has changed, and that affects only new PVC creation.

No migration needed unless you want to recreate PVCs to use the new binding mode, which is unnecessary for these stateful pods.

If you do want to migrate for consistency (a command sketch follows this list):

  1. Backup Vault data first (critical!)
  2. Scale down Vault StatefulSet to 0
  3. Delete PVCs (data preserved in Longhorn volumes)
  4. Scale up StatefulSet (recreates PVCs with new binding mode)
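
A hedged sketch of those steps; the StatefulSet name, label selector, and replica count below are assumptions and must be confirmed against the vault namespace first:

# 1. Back up Vault data first (e.g. vault operator raft snapshot save <file>, if using integrated Raft storage)
# 2. Scale the StatefulSet down (name "vault" is an assumption)
kubectl --context fzymgc-house -n vault scale statefulset vault --replicas=0
# 3. Delete the PVCs once the pods have terminated (label selector is an assumption)
kubectl --context fzymgc-house -n vault delete pvc -l app.kubernetes.io/name=vault
# 4. Scale back up; the StatefulSet recreates its PVCs under the new binding mode (replica count is an assumption)
kubectl --context fzymgc-house -n vault scale statefulset vault --replicas=3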

Recommendation: Skip this phase. The binding mode change only matters for new volumes, and existing Vault PVCs are working correctly with 2 replicas.

Phase 4: Verification

# Verify storage class configuration
kubectl --context fzymgc-house get storageclass longhorn-encrypted -o yaml
kubectl --context fzymgc-house get storageclass longhorn-1replica-encrypted -o yaml

# Confirm volumeBindingMode is WaitForFirstConsumer
# Confirm numberOfReplicas is "2" for both
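
# Or pull both fields directly (field paths assumed to match the keys checked above)
kubectl --context fzymgc-house get storageclass longhorn-encrypted longhorn-1replica-encrypted \
  -o custom-columns='NAME:.metadata.name,BINDING:.volumeBindingMode,REPLICAS:.parameters.numberOfReplicas'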

# Verify Vault PVCs still healthy
kubectl --context fzymgc-house get pvc -n vault
kubectl --context fzymgc-house get pod -n vault

# Check Longhorn replica status
kubectl --context fzymgc-house get volumes -n longhorn-system -o wide

Why These Changes Matter for HA

numberOfReplicas: "2"

  • HA Benefit: Data is replicated across 2 nodes
  • Failure Tolerance: Can survive single node failure
  • Migration: Pod can be rescheduled to another node and still access data
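
One way to spot-check this on existing volumes, assuming the standard Longhorn Volume CRD fields (spec.numberOfReplicas, status.robustness):

kubectl --context fzymgc-house -n longhorn-system get volumes.longhorn.io \
  -o custom-columns='NAME:.metadata.name,REPLICAS:.spec.numberOfReplicas,STATE:.status.state,ROBUSTNESS:.status.robustness'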

volumeBindingMode: WaitForFirstConsumer

  • HA Benefit: The volume is created based on where the pod is scheduled, rather than on nodes chosen before the pod exists
  • Scheduler Optimization: Kubernetes considers volume topology when placing pods
  • Anti-Affinity: Works better with pod anti-affinity rules to spread replicas

Immediate vs WaitForFirstConsumer Example

Immediate (problematic):

  1. User creates PVC
  2. Longhorn creates volume on Node A and Node B (2 replicas)
  3. Pod scheduled to Node C (far from volume replicas)
  4. Pod must access volume over network

WaitForFirstConsumer (optimal):

  1. User creates PVC (no volume created yet)
  2. Pod scheduled to Node A
  3. Longhorn creates volume with one replica on Node A (local)
  4. Second replica on Node B (remote, for HA)
  5. Pod accesses volume locally on Node A (faster)
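
This behaviour is easy to demonstrate after the change: a PVC on a WaitForFirstConsumer class stays Pending until some pod mounts it. The PVC name, namespace, and size below are hypothetical; delete the claim afterwards:

cat <<'EOF' | kubectl --context fzymgc-house apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wffc-demo        # hypothetical name
  namespace: default
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn-encrypted
  resources:
    requests:
      storage: 1Gi
EOF
kubectl --context fzymgc-house -n default get pvc wffc-demo    # expect STATUS=Pending until a pod consumes it
kubectl --context fzymgc-house -n default delete pvc wffc-demo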

Rollback Plan

If issues occur after applying Terraform changes:

cd tf/cluster-bootstrap

# Revert longhorn.tf to previous commit
git checkout HEAD~1 -- longhorn.tf

# Apply previous configuration
terraform plan -out=tfplan
terraform apply tfplan

Existing PVCs are not affected by StorageClass changes.

Timeline

  • Phase 1: 5 minutes (Terraform apply)
  • Phase 2: 1 minute (delete orphaned class)
  • Phase 3: Skip (not needed)
  • Phase 4: 5 minutes (verification)

Total: ~15 minutes

Success Criteria

  • ✅ All Longhorn storage classes have numberOfReplicas: "2"
  • ✅ All Longhorn storage classes use volumeBindingMode: WaitForFirstConsumer
  • ✅ Existing Vault PVCs remain healthy
  • ✅ No orphaned storage classes in cluster
  • ✅ New PVCs created with optimal HA configuration

Actual Execution Summary (2025-12-06)

Changes Applied Successfully ✅

Storage Classes Updated:

  1. longhorn-encrypted: volumeBindingMode changed from Immediate → WaitForFirstConsumer (2 replicas maintained)
  2. longhorn-1replica-encrypted: numberOfReplicas changed from 1 → 2, volumeBindingMode changed from Immediate → WaitForFirstConsumer
  3. longhorn-static: Deleted (orphaned resource)

Method Used:

  • StorageClass parameters and volumeBindingMode are immutable in Kubernetes
  • Had to delete and recreate the storage classes with the new configuration
  • Applied via kubectl (not Terraform) due to Vault secret path issues
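
A sketch of that delete-and-recreate flow for one class (the file name is arbitrary; server-managed fields such as uid, resourceVersion, and creationTimestamp must be stripped and the immutable fields edited before re-applying):

kubectl --context fzymgc-house get storageclass longhorn-encrypted -o yaml > longhorn-encrypted.yaml
# edit the manifest: remove server-managed fields, set volumeBindingMode/numberOfReplicas as desired
kubectl --context fzymgc-house delete storageclass longhorn-encrypted
kubectl --context fzymgc-house apply -f longhorn-encrypted.yaml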

Verification Results:

  • ✅ All Longhorn storage classes now have 2 replicas
  • ✅ All use WaitForFirstConsumer binding mode
  • ✅ All 6 Vault PVCs remain Bound and healthy
  • ✅ All 4 Vault pods Running
  • ✅ Longhorn volumes confirmed with 2 replicas each

Terraform State Status ⚠️

Issue: Terraform state was out of sync after manual kubectl apply.

Resolution:

  • Removed the storage classes from Terraform state with terraform state rm
  • The storage classes will be brought back under Terraform management on a later run (see Next Steps for Terraform)
  • Cluster resources are correct and operational
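
The state removal looked roughly like this; the resource addresses are assumptions, so list them first and adjust:

cd tf/cluster-bootstrap
terraform state list | grep -i storage_class
terraform state rm 'kubernetes_storage_class_v1.longhorn_encrypted'
terraform state rm 'kubernetes_storage_class_v1.longhorn_1replica_encrypted'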

Note: Terraform plan currently fails due to missing Vault secrets at the expected paths:

  • fzymgc-house/data/cluster/argocd
  • fzymgc-house/data/infrastructure/pki/fzymgc-ica1-ca
  • fzymgc-house/data/cluster/longhorn/crypto-key
  • fzymgc-house/data/cluster/longhorn/cloudflare-r2

This is a separate issue from the storage class migration and should be addressed independently.

Next Steps for Terraform

When the Vault secret paths are corrected:

  1. Run terraform plan to verify the configuration
  2. Import the existing storage classes into state (Terraform will not import them automatically)
  3. Re-run terraform plan to confirm state and cluster are reconciled
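
A hedged sketch of the import step (resource addresses are assumptions; for the Kubernetes provider the import ID is simply the StorageClass name):

cd tf/cluster-bootstrap
terraform import 'kubernetes_storage_class_v1.longhorn_encrypted' longhorn-encrypted
terraform import 'kubernetes_storage_class_v1.longhorn_1replica_encrypted' longhorn-1replica-encrypted
terraform plan    # should show no further changes for the storage classes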

The cluster is correctly configured now; syncing Terraform state is only needed for future management.