Temporal Operations

Operational guide for Temporal workflow orchestration in the fzymgc-house cluster.

Quick Reference

Property         | Value
-----------------|------------------------------------------------------
Web UI           | https://temporal.fzymgc.house
Frontend Service | temporal-frontend.temporal.svc.cluster.local:7233
Namespace        | temporal
Auth Method      | Forward-Auth (Authentik)
Workers Repo     | fzymgc-house/temporal-workers
Vault Path       | secret/fzymgc-house/cluster/temporal/*

Architecture

Two-Database Design

Temporal uses two separate PostgreSQL databases on the CNPG main cluster:

Database            | Purpose                                    | Schema Version
--------------------|--------------------------------------------|---------------
temporal            | Default store (workflow state, history)    | v1.18+
temporal_visibility | Visibility store (workflow search/listing) | v1.9+

This separation prevents schema version collisions: each store keeps its own schema_version table, so the version recorded for one store cannot block migrations of the other.

Components

Component           | Replicas | Purpose
--------------------|----------|------------------------------------------
temporal-frontend   | 1        | API gateway, client connections
temporal-history    | 1        | Workflow state management
temporal-matching   | 1        | Task queue routing
temporal-worker     | 1        | Internal workflows (archival, replication)
temporal-web        | 1        | Web UI
temporal-admintools | 1        | CLI tools (tctl, temporal)

Worker Deployment

Workers run via the Temporal Worker Controller, which manages TemporalWorkerDeployment custom resources to perform rainbow deployments: several worker versions run side by side while new versions are rolled out progressively.

┌─────────────────────────────────────────────────────────────┐
│                    Temporal Server                          │
│  ┌──────────┐ ┌─────────┐ ┌──────────┐ ┌────────┐          │
│  │ Frontend │ │ History │ │ Matching │ │ Worker │          │
│  └────┬─────┘ └────┬────┘ └────┬─────┘ └────────┘          │
│       │            │           │                            │
│       └────────────┴───────────┘                            │
│                    │                                        │
│            ┌───────┴───────┐                                │
│            │   PostgreSQL  │                                │
│            │ temporal      │                                │
│            │ temporal_vis  │                                │
│            └───────────────┘                                │
└─────────────────────────────────────────────────────────────┘
        ┌────────────┼────────────┐
        ▼            ▼            ▼
   ┌─────────┐  ┌─────────┐  ┌─────────┐
   │ Worker  │  │ Worker  │  │ Worker  │
   │ (core)  │  │ (v2)    │  │ (v3)    │
   └─────────┘  └─────────┘  └─────────┘
       Rainbow Deployment (Worker Controller)

Common Operations

Access Admin Tools

# Interactive shell
kubectl exec -it -n temporal deploy/temporal-admintools -- bash

# Single command
kubectl exec -n temporal deploy/temporal-admintools -- tctl <command>
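
Every command in this guide repeats the same kubectl exec prefix, so a small shell wrapper can cut down on typing. The following is a minimal sketch; the function name tadmin is hypothetical and not part of the cluster tooling.

```shell
# Hypothetical helper: run any command inside the temporal-admintools deployment.
# The name `tadmin` is an assumption made for this sketch.
tadmin() {
  kubectl exec -n temporal deploy/temporal-admintools -- "$@"
}

# Usage (same commands as elsewhere in this guide, minus the prefix):
#   tadmin tctl --ns workflows workflow list
#   tadmin tctl cluster health
```

The wrapper passes its arguments through unchanged, so quoting behaves the same as in the full commands.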

Namespace Management

# List namespaces
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns default namespace list

# Create namespace (tctl takes the namespace name from the global --ns flag)
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns workflows namespace register

# Describe namespace
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns workflows namespace describe

Workflow Operations

# List workflows
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns workflows workflow list

# Execute workflow
kubectl exec -n temporal deploy/temporal-admintools -- \
  temporal workflow execute \
    --namespace workflows \
    --task-queue default \
    --type HelloWorkflow \
    --input '"World"'

# Get workflow history
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns workflows workflow showid <workflow-id>

# Terminate workflow
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns workflows workflow terminate --workflow_id <id> --reason "manual termination"

Task Queue Status

# Describe task queue (shows workers)
kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl --ns workflows taskqueue describe --taskqueue default

Check Cluster Health

kubectl exec -n temporal deploy/temporal-admintools -- \
  tctl cluster health

Expected output: temporal.api.workflowservice.v1.WorkflowService: SERVING
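
For scripted checks, the health output can be turned into an exit code. This is a sketch under the assumption that the output contains the SERVING string shown above; the function name temporal_healthy is hypothetical.

```shell
# Sketch: succeed only when the frontend reports SERVING.
# `temporal_healthy` is a hypothetical name, not an existing tool.
temporal_healthy() {
  out=$(kubectl exec -n temporal deploy/temporal-admintools -- tctl cluster health 2>&1) || {
    printf 'health command failed: %s\n' "$out" >&2
    return 1
  }
  case "$out" in
    # NOT_SERVING also contains "SERVING", so it must be matched first.
    *NOT_SERVING*) printf 'unhealthy: %s\n' "$out" >&2; return 1 ;;
    *SERVING*)     printf 'healthy: %s\n' "$out"; return 0 ;;
    *)             printf 'unexpected output: %s\n' "$out" >&2; return 1 ;;
  esac
}
```

A caller can then write `temporal_healthy || echo "investigate the frontend"` in a cron job or pre-deploy check.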

Database Operations

Check Schema Versions

# Default store
kubectl exec -n postgres main-13 -- \
  psql -U postgres -d temporal -c "SELECT * FROM schema_version;"

# Visibility store
kubectl exec -n postgres main-13 -- \
  psql -U postgres -d temporal_visibility -c "SELECT * FROM schema_version;"
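
The commands above hard-code the pod name main-13, which changes after a failover. The sketch below looks up the current primary first. It assumes the CNPG cluster is named main and that pods carry the cnpg.io/cluster and cnpg.io/instanceRole labels (older CNPG releases used a plain role label instead); both function names are hypothetical.

```shell
# Sketch: resolve the current CNPG primary instead of hard-coding main-13.
# Assumptions: cluster name "main", labels cnpg.io/cluster and cnpg.io/instanceRole.
cnpg_primary() {
  kubectl get pods -n postgres \
    -l cnpg.io/cluster=main,cnpg.io/instanceRole=primary \
    -o jsonpath='{.items[0].metadata.name}'
}

# Print the schema_version table of both Temporal stores from the primary.
temporal_schema_versions() {
  primary=$(cnpg_primary) || return 1
  [ -n "$primary" ] || { echo "no primary pod found" >&2; return 1; }
  for db in temporal temporal_visibility; do
    echo "== $db =="
    kubectl exec -n postgres "$primary" -- \
      psql -U postgres -d "$db" -c "SELECT * FROM schema_version;"
  done
}
```

Running temporal_schema_versions prints one section per database, which makes it easy to confirm that each store reports its own version.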

Verify Tables

# Default store tables (should have ~39 tables)
# -At prints one unaligned row per table with no header or footer, so wc -l counts tables only
kubectl exec -n postgres main-13 -- \
  psql -U postgres -d temporal -At -c "\dt" | wc -l

# Visibility store tables
kubectl exec -n postgres main-13 -- \
  psql -U postgres -d temporal_visibility -c "\dt"

Expected visibility tables:

  • executions_visibility
  • schema_update_history
  • schema_version
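
The expected-table list can be checked mechanically rather than by eye. The following sketch reads table names on stdin and reports any of the three tables that is missing; the function name is hypothetical, and the usage comment assumes the tables live in the public schema.

```shell
# Sketch: report any expected visibility table that is missing.
# Reads one table name per line on stdin; returns non-zero if any is absent.
missing_visibility_tables() {
  tables=$(cat)
  rc=0
  for t in executions_visibility schema_update_history schema_version; do
    if ! printf '%s\n' "$tables" | grep -qx "$t"; then
      echo "missing: $t"
      rc=1
    fi
  done
  return $rc
}

# Usage (assumes the tables are in the "public" schema):
#   kubectl exec -n postgres main-13 -- \
#     psql -U postgres -d temporal_visibility -At \
#     -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public';" \
#     | missing_visibility_tables
```

Extra tables in the output are ignored, so the check keeps working if later schema versions add more tables.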

Troubleshooting

Visibility Store Errors

Error: ListWorkflowExecutions operation failed. Select failed: pq: relation "executions_visibility" does not exist

Cause: The schema migration did not run against the visibility store. This is usually caused by one of:

  1. A single-database setup, where the default store's schema version blocked the visibility migration
  2. The schema job connecting to the wrong database

Fix:

  1. Verify two-database architecture is deployed:

    kubectl get database -n postgres | grep temporal
    
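    Expected: two Database resources, temporal and temporal-visibility (Kubernetes resource names cannot contain underscores; the PostgreSQL database itself is temporal_visibility)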

  2. Delete and re-sync schema job:

    kubectl delete job -n temporal -l app.kubernetes.io/component=schema
    argocd app sync temporal-server
    

  3. Verify visibility tables exist:

    kubectl exec -n postgres main-13 -- \
      psql -U postgres -d temporal_visibility -c "\dt"
    

Worker Not Processing Tasks

Symptoms: Workflows remain in the "Running" state and make no progress.

Checks:

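  1. Worker pod running (temporal-worker is the server's internal worker service listed under Components; application workers managed by the Worker Controller are separate pods and may carry different labels):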

    kubectl get pods -n temporal -l app.kubernetes.io/name=temporal-worker
    

  2. Worker connected to task queue:

    kubectl exec -n temporal deploy/temporal-admintools -- \
      tctl --ns workflows taskqueue describe --taskqueue default
    

  3. Worker logs:

    kubectl logs -n temporal -l app.kubernetes.io/name=temporal-worker --tail=100
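
Because application workers are managed through TemporalWorkerDeployment resources, it can also help to look at those resources directly. This is a sketch only: it assumes the CRD plural is temporalworkerdeployments and that the workers run in the temporal namespace, neither of which this guide confirms. The function name is hypothetical.

```shell
# Sketch: overview of Worker Controller resources and pods.
# Assumptions: CRD plural "temporalworkerdeployments"; workers run in the
# temporal namespace. Adjust both if the cluster is laid out differently.
worker_overview() {
  echo "== TemporalWorkerDeployment resources =="
  kubectl get temporalworkerdeployments -A
  echo "== Pods in temporal namespace =="
  kubectl get pods -n temporal -o wide
}
```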
    

Server Pod CrashLooping

Check logs:

kubectl logs -n temporal deploy/temporal-frontend --tail=100
kubectl logs -n temporal deploy/temporal-history --tail=100

Common causes:

  • Database connection issues (check CNPG cluster health)
  • Secret not synced (check ExternalSecret status)
  • Schema not applied (check schema job completion)
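
The three causes above can be checked in one pass. The sketch below assumes the CNPG Cluster resource lives in the postgres namespace and that the External Secrets Operator CRDs are installed; the schema job label is the one used elsewhere in this guide, and the function name is hypothetical.

```shell
# Sketch: walk the common CrashLoop causes in order.
# Assumptions: CNPG Cluster in namespace postgres; External Secrets Operator installed.
temporal_triage() {
  echo "== CNPG cluster health =="
  kubectl get clusters.postgresql.cnpg.io -n postgres
  echo "== ExternalSecret status =="
  kubectl get externalsecrets.external-secrets.io -n temporal
  echo "== Schema job completion =="
  kubectl get jobs -n temporal -l app.kubernetes.io/component=schema
}
```

Each section prints the raw kubectl table, so an unhealthy cluster, an unsynced secret, or an incomplete job shows up without further parsing.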

Schema Job Stuck

Check job status:

kubectl get jobs -n temporal
kubectl describe job temporal-schema-1 -n temporal

Check init container logs:

kubectl logs -n temporal job/temporal-schema-1 -c setup-default-store
kubectl logs -n temporal job/temporal-schema-1 -c setup-visibility-store

If the sync fails because the existing job is immutable, delete the job and re-sync so it is recreated:

kubectl delete job temporal-schema-1 -n temporal
argocd app sync temporal-server
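
The delete and re-sync steps can be followed by a wait so that the result is known before moving on. This sketch assumes the job is recreated under the same name (temporal-schema-1 by default) and that a five-minute timeout is acceptable; the function name is hypothetical.

```shell
# Sketch: recreate the schema job and block until it completes.
# Assumptions: the job keeps its name after the sync; 300s is long enough.
recreate_schema_job() {
  job=${1:-temporal-schema-1}
  kubectl delete job "$job" -n temporal --ignore-not-found || return 1
  argocd app sync temporal-server || return 1
  kubectl wait --for=condition=complete "job/$job" -n temporal --timeout=300s
}
```

If kubectl wait times out, the init container logs shown above are the next place to look.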

Secrets

Vault Paths

Path                                              | Contents
--------------------------------------------------|---------------------
fzymgc-house/cluster/postgres/users/main-temporal | Database credentials
fzymgc-house/cluster/temporal/*                   | Worker secrets

Kubernetes Secrets

Secret                  | Namespace | Source
------------------------|-----------|-----------------------
temporal-db-secret      | temporal  | ExternalSecret → Vault
temporal-worker-secrets | temporal  | ExternalSecret → Vault
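
To confirm that both secrets were synced without printing their values, the key names alone can be listed. This is a minimal sketch using kubectl's go-template output; the function name is hypothetical.

```shell
# Sketch: list the key names (not the values) of the Temporal secrets.
temporal_secret_keys() {
  for s in temporal-db-secret temporal-worker-secrets; do
    echo "== $s =="
    kubectl get secret "$s" -n temporal \
      -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
  done
}
```

An empty section or a NotFound error for either secret points at the corresponding ExternalSecret, which can be inspected with kubectl describe externalsecret -n temporal.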