Windmill Migration Progress
Migration from Argo Events + Argo Workflows to Windmill for GitOps automation.
Issue Tracker: GitHub Issues #127-#150
Overview
The migration moves Terraform GitOps workflows from Argo Events/Workflows to Windmill with:
- GitHub webhooks directly to Windmill (original plan; Phase 3 uses GitHub Actions triggers instead)
- Flows defined as code in the repository
- GitHub Actions with wmill sync for deployment
- Self-hosted GitHub Actions runner (actions-runner-controller)
- Storj S3 for workflow storage
- Windmill approval steps with Discord notifications
- Concurrency control for Terraform applies
Migration Phases
Phase 0: Disable Argo Events and Workflows ✅ COMPLETE
Status: Completed 2025-12-07
Issue: #127
Completed Tasks:
- ✅ Disabled automated syncing in ArgoCD applications
  - Modified argocd/cluster-app/templates/argo-events.yaml
  - Modified argocd/cluster-app/templates/argo-workflows.yaml
- ✅ Merged PR #151 to main branch
- ✅ Synced cluster-app to apply changes
- ✅ Scaled all Argo Events deployments to 0 replicas
- ✅ Scaled all Argo Workflows deployments to 0 replicas
Verification:
# All deployments scaled to zero
kubectl --context fzymgc-house get deployment -n argo-events
kubectl --context fzymgc-house get deployment -n argo-workflows
# No workflow pods running
kubectl --context fzymgc-house get pods -n argo-workflows
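The scale-down check above can also be scripted. The following is a minimal sketch, assuming its input is the parsed JSON from `kubectl get deployment -n <namespace> -o json`; the helper name `deployments_not_scaled_down` and the sample deployment names are hypothetical and not part of the repository.

```python
def deployments_not_scaled_down(deployment_list: dict) -> list[str]:
    """Return names of deployments whose desired replica count is not zero.

    `deployment_list` is the parsed output of
    `kubectl get deployment -n <namespace> -o json` (a Kubernetes List object).
    """
    offenders = []
    for item in deployment_list.get("items", []):
        name = item["metadata"]["name"]
        # Kubernetes defaults spec.replicas to 1 when it is omitted.
        replicas = item.get("spec", {}).get("replicas", 1)
        if replicas != 0:
            offenders.append(name)
    return offenders


# Made-up sample data shaped like the kubectl JSON output.
sample = {
    "items": [
        {"metadata": {"name": "example-controller"}, "spec": {"replicas": 0}},
        {"metadata": {"name": "example-server"}, "spec": {"replicas": 2}},
    ]
}
still_running = deployments_not_scaled_down(sample)
```

An empty result means every deployment in the namespace has a desired replica count of zero.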
Impact: After this phase, GitHub webhooks no longer triggered Terraform workflows, and Terraform had to be run manually until the Windmill flows became operational.
Phase 1: Set up Infrastructure ✅ COMPLETE
Status: Completed 2025-12-07
Issues: #128, #129, #130, #131
Completed Tasks:
- ✅ Created workspace terraform-gitops (#128)
  - Workspace token stored in Vault
  - Workspace synced with wmill sync
- ✅ Created actions-runner-controller deployment (#129)
  - ArgoCD app: argocd/cluster-app/templates/actions-runner-controller.yaml
  - Runner config: argocd/app-configs/actions-runner-controller/
  - GitHub token stored in Vault
  - Documentation: docs/github-token-setup.md
- ✅ Configured Storj S3 storage (#130)
  - Resource: windmill/u/admin/terraform_s3_storage.resource.yaml
  - Documentation: docs/windmill-s3-setup.md
- ✅ Configured Discord bot integration (#131)
  - Resource: windmill/u/admin/terraform_discord_bot.resource.yaml
  - Documentation: docs/windmill-discord-bot-setup.md
  - Bot credentials stored in Vault
Workspace Resources Created:
- u/admin/github_token - GitHub repository access
- u/admin/terraform_discord_bot - Discord notifications with interactive buttons
- u/admin/terraform_s3_storage - S3 backend for Terraform state
Documentation:
- docs/windmill-sync-setup.md - Complete sync workflow guide
- docs/github-token-setup.md - GitHub PAT creation guide
- docs/windmill-s3-setup.md - S3 storage configuration
- docs/windmill-discord-bot-setup.md - Discord bot setup
- scripts/setup-windmill-sync.sh - Automated workspace sync script
Phase 2: Develop Windmill Flows ✅ COMPLETE
Status: Completed 2025-12-07
Issues: #132, #133, #134, #135, #136
Completed Tasks:
- ✅ Created windmill/ directory structure (#132)
  - wmill.yaml with Python3 default runtime
  - variables.json with all secret definitions
  - f/terraform/ for flows and scripts
  - u/admin/ for resources
- ✅ Wrote reusable scripts (#133)
  - git_clone.py - Clone repository with GitHub token
  - terraform_init.py - Initialize Terraform with S3 backend
  - terraform_plan.py - Run plan and parse output
  - terraform_apply.py - Apply Terraform changes
  - notify_approval.py - Discord approval notifications
  - notify_status.py - Discord status notifications
- ✅ Created Terraform deployment flow (#134)
  - deploy_terraform.flow - Generic parameterized flow for all modules
  - Includes: Clone → Init → Plan → Approval → Apply → Notify
  - Skip logic for no-change plans
  - Error handling with failure notifications
- ✅ Configured Windmill resources (#135)
  - Discord bot configuration
  - S3 storage configuration
  - GitHub token configuration
- ✅ Synced to Windmill (#136)
  - All scripts pushed successfully
  - All resources created
  - Flow deployed and ready to test
Files Created:
- Scripts: windmill/f/terraform/*.py (6 scripts)
- Flow: windmill/f/terraform/deploy_terraform.flow/flow.yaml
- Resources: windmill/u/admin/*.resource.yaml (3 resources)
- Variables: windmill/variables.json
- Configuration: windmill/wmill.yaml
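The control flow of deploy_terraform (Clone → Init → Plan → Approval → Apply → Notify, with the no-change skip and failure notification) can be sketched in plain Python. This is an illustration under stated assumptions, not the contents of flow.yaml: the step callables, the `FlowResult` type, and the status strings are hypothetical. Detecting an empty plan through the exit codes of `terraform plan -detailed-exitcode` (0 = no changes, 1 = error, 2 = changes present) is also an assumption; the real terraform_plan.py parses plan output and may work differently.

```python
from dataclasses import dataclass
from typing import Callable

# Exit codes documented for `terraform plan -detailed-exitcode`.
PLAN_NO_CHANGES = 0
PLAN_HAS_CHANGES = 2


@dataclass
class FlowResult:
    status: str           # "skipped", "rejected", "applied", or "failed"
    steps_run: list


def run_deploy(
    clone: Callable[[], None],
    init: Callable[[], None],
    plan: Callable[[], int],        # returns the plan exit code
    approve: Callable[[], bool],    # True when a human approves the plan
    apply: Callable[[], None],
    notify: Callable[[str], None],  # receives the final status
) -> FlowResult:
    steps_run = []
    try:
        clone()
        steps_run.append("clone")
        init()
        steps_run.append("init")
        exit_code = plan()
        steps_run.append("plan")
        if exit_code not in (PLAN_NO_CHANGES, PLAN_HAS_CHANGES):
            raise RuntimeError(f"terraform plan failed with exit code {exit_code}")
        if exit_code == PLAN_NO_CHANGES:
            # Skip logic: nothing to approve or apply.
            status = "skipped"
        else:
            approved = approve()
            steps_run.append("approval")
            if approved:
                apply()
                steps_run.append("apply")
                status = "applied"
            else:
                status = "rejected"
    except Exception:
        # Any failing step ends the run; the status is still reported below.
        status = "failed"
    # Assumption: every outcome is reported, not only failures.
    notify(status)
    return FlowResult(status, steps_run)
```

Passing the steps in as callables keeps the ordering and branching testable without a Terraform binary or a Discord bot.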
Phase 3: GitHub Integration ✅ COMPLETE
Status: Completed 2025-12-08
Issues: #137 (closed - alternative approach), #138 (closed - alternative approach)
Completed Tasks:
- ✅ Created GitHub Actions workflows for Windmill sync
  - windmill-deploy-prod.yaml - Deploys to production on PR merge
  - sync-windmill-secrets.yaml - Syncs secrets from Vault
  - sync-main-to-windmill-staging.yaml - Syncs main to staging branch
- ✅ Integrated Vault secret sync into deploy workflow
  - Secrets synced from Vault AFTER code deployment
  - Uses AppRole authentication for secure access
Note: Native GitHub webhooks to Windmill were NOT implemented. Instead, GitHub Actions workflows trigger deployments, which provides better control and observability.
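For readers unfamiliar with the AppRole authentication mentioned above, the login exchange looks roughly like this. It is a minimal sketch, assuming the AppRole auth method is mounted at Vault's default `approle` path and that the caller talks to Vault's HTTP API directly. The actual workflow may use a ready-made GitHub Action instead, and the function names and the example address are hypothetical.

```python
import json
import urllib.request


def build_approle_login(vault_addr: str, role_id: str, secret_id: str) -> urllib.request.Request:
    """Build (but do not send) a Vault AppRole login request."""
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_client_token(login_response: dict) -> str:
    """Pull the client token out of a parsed AppRole login response body."""
    return login_response["auth"]["client_token"]


# Hypothetical address and placeholder credentials, for illustration only.
login = build_approle_login("https://vault.example.com", "<role-id>", "<secret-id>")
```

The token returned by the login call is then sent in the `X-Vault-Token` header when reading the secrets to sync.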
Phase 4: Testing ✅ COMPLETE
Status: Completed 2025-12-08
Issues: #139, #140, #141, #142 (all closed)
Completed Tasks:
- ✅ Manual flow execution tests (#139) - Flows operational in staging/prod
- ✅ GitHub Actions trigger tests (#140) - PR merges trigger deployments
- ✅ Concurrency control verification (#141) - Windmill handles concurrency
- ✅ Error handling validation (#142) - Discord notifications working
Phase 5: Cleanup ✅ COMPLETE
Status: Completed 2025-12-09
Issues: #143, #144, #145, #146, #147 (all closed)
Completed Tasks:
- ✅ Remove Argo Events manifests (#143) - PR #234
- ✅ Remove Argo Workflows manifests (#144) - PR #234
- ✅ Clean up Argo secrets (#145) - Deleted with namespaces
- ✅ Remove Argo RBAC resources (#146) - Deleted with namespaces
- ✅ Update documentation (#147) - PR #234
Removed in PR #234:
- argocd/cluster-app/templates/argo-events.yaml
- argocd/cluster-app/templates/argo-workflows.yaml
- argocd/cluster-app/templates/authentik-config.yaml
- argocd/cluster-app/templates/grafana-config.yaml
- argocd/cluster-app/templates/vault-config.yaml
- argocd/app-configs/argo-events/ (entire directory)
- argocd/app-configs/argo-workflows/ (entire directory)
- argocd/app-configs/authentik-config/ (entire directory - Argo-only)
- argocd/app-configs/grafana-config/ (entire directory - Argo-only)
- argocd/app-configs/vault-config/ (entire directory - Argo-only)
Cluster Cleanup:
# Verified namespaces deleted
kubectl --context fzymgc-house get namespace | grep -E "argo-events|argo-workflows"
# (returns empty - namespaces fully removed)
Phase 6: Monitoring and Optimization ⏳ PENDING
Status: Not Started
Issues: #148, #149, #150
Tasks:
- [ ] Create Grafana dashboards (#148)
- [ ] Implement workflow result caching (#149)
- [ ] Set up alerting (#150)
Current State
Active Components
- ✅ Windmill (deployed, 3 worker groups, staging + prod workspaces)
- ✅ PostgreSQL (Windmill database)
- ✅ Redis (Windmill cache)
- ✅ GitHub Actions Runner Controller (ARC) with custom image
- ✅ Vault AppRole for CI/CD authentication
Operational Workflows
- ✅ Vault secret sync from GitHub Actions
- ✅ Windmill code sync via wmill sync push
- ✅ Production deployment on PR merge with the windmill label
- ✅ Discord notifications for approvals and status
Terraform GitOps Trigger
The terraform-deploy.yml GitHub Actions workflow triggers Windmill flows when Terraform modules change:
- Trigger: Push to main with changes in any supported tf/* module
- Manual: workflow_dispatch with module selector
- Flow: f/terraform/deploy_terraform (generic parameterized flow)
- Runner: fzymgc-house-cluster-runners (self-hosted for Windmill API access)
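How the workflow reaches Windmill is not spelled out here. One plausible shape is sketched below, assuming the runner calls Windmill's REST endpoint for running a flow by path (`POST /api/w/{workspace}/jobs/run/f/{path}`) with a bearer token. The base URL, the token placeholder, the `module` argument name, and the helper name `build_flow_run_request` are all hypothetical.

```python
import json
import urllib.request


def build_flow_run_request(base_url, workspace, flow_path, token, args):
    """Build (but do not send) a request that starts a Windmill flow by path."""
    url = f"{base_url.rstrip('/')}/api/w/{workspace}/jobs/run/f/{flow_path}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(args).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_flow_run_request(
    base_url="https://windmill.example.com",
    workspace="terraform-gitops",
    flow_path="f/terraform/deploy_terraform",
    token="<windmill-token>",
    args={"module": "vault"},  # argument name is an assumption
)
```

Because the flow path itself starts with `f/`, the resulting URL contains `jobs/run/f/f/terraform/deploy_terraform`; the doubled `f/` is expected.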
Removed Components
- ✅ Argo Events (fully removed - PR #234)
- ✅ Argo Workflows (fully removed - PR #234)
- ✅ EventBus (removed with argo-events namespace)
Supported Terraform Modules
- tf/vault - Vault policies and configuration
- tf/grafana - Grafana dashboards and data sources
- tf/authentik - Authentik applications and groups
- tf/cloudflare - Cloudflare DNS and tunnel configuration
- tf/core-services - Core services configuration
Excluded:
- tf/teleport - Empty directory, no Terraform files
- tf/cluster-bootstrap - Chicken-and-egg problem: this module deploys ArgoCD and Windmill itself, so auto-triggering it could break the deployment infrastructure mid-apply
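The supported/excluded split amounts to an allow-list applied to the changed file paths. The following is a minimal sketch of that selection, assuming the changed paths are available as repository-relative strings; the function name `modules_to_deploy` and the sample file names are hypothetical, and the real workflow may detect changes differently (for example with path filters in the workflow file).

```python
# Allow-list taken from the supported modules listed above.
SUPPORTED_MODULES = ("vault", "grafana", "authentik", "cloudflare", "core-services")


def modules_to_deploy(changed_paths):
    """Return the sorted tf/<module> directories touched by the changed files.

    Excluded modules (tf/teleport, tf/cluster-bootstrap) are simply absent
    from the allow-list, so changes there never select a module.
    """
    modules = set()
    for path in changed_paths:
        parts = path.split("/")
        # Expect at least tf/<module>/<file>.
        if len(parts) >= 3 and parts[0] == "tf" and parts[1] in SUPPORTED_MODULES:
            modules.add(f"tf/{parts[1]}")
    return sorted(modules)


# Made-up file names for illustration.
selected = modules_to_deploy([
    "tf/vault/main.tf",
    "tf/vault/policies.tf",
    "tf/grafana/dashboards.tf",
    "tf/cluster-bootstrap/main.tf",
    "docs/windmill-s3-setup.md",
])
# selected == ["tf/grafana", "tf/vault"]
```

Using an allow-list rather than a deny-list means a newly added tf/ directory is never deployed automatically until someone opts it in.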
Rollback Plan
Not applicable. Per user decision, Argo Events and Argo Workflows were removed regardless of whether the Windmill migration succeeded.
References
- Migration Plan: GitHub Issues #127-#150
- Windmill Deployment: argocd/cluster-app/templates/windmill.yaml
- Windmill Flows: windmill/f/terraform/
- GitHub Actions Workflows:
  - .github/workflows/windmill-deploy-prod.yaml
  - .github/workflows/sync-windmill-secrets.yaml
  - .github/workflows/sync-main-to-windmill-staging.yaml
Last Updated: 2025-12-14
Phase Completion Summary
- ✅ Phase 0: Argo Events/Workflows disabled
- ✅ Phase 1: Infrastructure configured (workspace, runner, S3, Discord)
- ✅ Phase 2: Windmill flows and scripts developed
- ✅ Phase 3: GitHub Integration via Actions workflows
- ✅ Phase 4: Testing completed
- ✅ Phase 5: Cleanup completed (PR #234)
- ⏳ Phase 6: Monitoring
Next Steps: Begin Phase 6 - create Grafana dashboards, configure caching, set up alerting