feat(chart): opt-in Argo Rollouts (blue-green) + pre-deploy migration Job #3653
nicacioliveira wants to merge 1 commit
Adds two opt-in flags to the studio Helm chart so consumers can switch from
the default Deployment to an Argo Rollouts Rollout with blue-green strategy,
and move DB migrations out of pod startup into a dedicated pre-sync Job.
Both default to off — existing installs keep the exact same Deployment with
the on-startup migration path. No selector / label changes, no PVC churn.
## What's new
- `argoRollouts.enabled` — when true, render a `Rollout` (argoproj.io/v1alpha1)
instead of the `Deployment`. Pod template is shared via the new
`chart-deco-studio.podTemplate` helper so the two workload kinds describe an
identical pod surface. Supports `blueGreen` (default) and `canary` strategies.
- `migrationJob.enabled` — when true, render a Job that runs
`bun run --cwd=apps/mesh migrate` ONCE before pods start. Carries BOTH
`helm.sh/hook: pre-install,pre-upgrade` and
`argocd.argoproj.io/hook: PreSync` annotations so it sequences correctly
whether installed via `helm upgrade` directly or synced by ArgoCD. The
runtime pod command gets `--skip-migrations` appended (the studio CLI
already exposes this flag — see `apps/mesh/src/cli.ts`), eliminating the
race between N replicas migrating concurrently and giving a clear
pre-deploy gate: migration Job fails → release aborted.
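For orientation, here is a minimal sketch of the kind of `Rollout` this flag would render with the default blue-green strategy. The resource names, labels, replica count, image, and `autoPromotionEnabled` value are illustrative assumptions and are not taken from the chart; only the `blueGreen` field names come from the Argo Rollouts API, and the command is the default pod command with `--skip-migrations` appended.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: deco-studio                          # hypothetical release name
spec:
  replicas: 2                                # illustrative
  selector:
    matchLabels:
      app.kubernetes.io/name: deco-studio    # hypothetical label
  strategy:
    blueGreen:
      activeService: deco-studio             # Service receiving live traffic
      previewService: deco-studio-preview    # hypothetical preview Service name
      autoPromotionEnabled: false            # illustrative: promote manually
  template:
    metadata:
      labels:
        app.kubernetes.io/name: deco-studio
    spec:
      containers:
        - name: studio
          image: example.invalid/studio:tag  # placeholder image
          command: ["bun", "run", "deco", "--no-local-mode", "--skip-migrations"]
```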
## New / modified files
- `templates/_pod-template.tpl` (new): shared pod template helper, plus the
`podCommand` helper that appends `--skip-migrations` when `migrationJob.enabled` is true.
- `templates/deployment.yaml`: now wrapped in
`{{- if not (and .Values.argoRollouts .Values.argoRollouts.enabled) }}` and
references the helper; the entire `spec.template` body moves into the helper.
- `templates/rollout.yaml` (new): gated on `argoRollouts.enabled`, mirrors
the Deployment via the same helper, and picks blueGreen or canary based on
values. Calls `fail` when both strategies are enabled at once (mutual exclusion).
- `templates/migration-job.yaml` (new): gated on `migrationJob.enabled`.
Sync-wave -1, dual hooks (Helm and ArgoCD), bounded backoffLimit/activeDeadlineSeconds/TTL.
- `templates/service-preview.yaml` (new): rendered only for blue-green;
Argo Rollouts manages its selector to point at the preview ReplicaSet.
- `values.yaml`: adds `argoRollouts` and `migrationJob` blocks; both are off
by default, so the defaults preserve current behavior.
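As a rough illustration of the dual-hook migration Job described above, the sketch below combines the Helm and ArgoCD annotations with bounded retry, deadline, and TTL settings. The Job name, image, and the three numeric bounds are assumptions chosen for the example and do not reflect the chart's actual defaults.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: deco-studio-migrate                  # hypothetical name
  annotations:
    # Helm path: run before install and upgrade
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
    # ArgoCD path: run in the PreSync phase
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/sync-wave: "-1"
spec:
  backoffLimit: 2                            # illustrative bound
  activeDeadlineSeconds: 600                 # illustrative bound
  ttlSecondsAfterFinished: 3600              # illustrative bound
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.invalid/studio:tag  # placeholder image
          command: ["bun", "run", "--cwd=apps/mesh", "migrate"]
```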
## Why opt-in
The chart is open-source, and not every consumer has the argo-rollouts controller
installed. Defaulting to Deployment imposes no extra requirements on the consumer's
cluster. Internal CD (deco-apps-cd) flips both flags on for the deco-studio /
deco-studio-stg releases; that is a separate change.
## Migration discipline note
Blue-green amplifies the schema/code overlap window. The Job moves migrations
to a single execution point BEFORE the new ReplicaSet's pods start, but it does NOT
make destructive DDL safe: the old (blue) ReplicaSet keeps serving traffic
against the already-migrated schema during the overlap window. Destructive changes
(DROP/RENAME/type changes) still require expand-contract discipline at the
migration code level. This is independent of the chart and is being handled
team-side as a code/review practice.
## Verification
- `helm lint deploy/helm/studio` passes
- `helm template deco-studio deploy/helm/studio` (default) renders identical
workload surface to before — Deployment with `bun run deco --no-local-mode`,
no Rollout, no preview Service, no migration Job
- `helm template ... --set argoRollouts.enabled=true --set migrationJob.enabled=true`
renders Rollout with blueGreen, preview Service, migration Job with the
PreSync hooks, and `--skip-migrations` appended to the pod command
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
5 issues found across 6 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="deploy/helm/studio/templates/rollout.yaml">
<violation number="1" location="deploy/helm/studio/templates/rollout.yaml:69">
P2: Strategy selection silently falls back to blue-green when both strategy flags are false, producing inconsistent Rollout/Service manifests.</violation>
</file>
<file name="deploy/helm/studio/templates/migration-job.yaml">
<violation number="1" location="deploy/helm/studio/templates/migration-job.yaml:33">
P1: The migration hook runs before its ConfigMap/Secret dependencies exist, causing first install/sync failure when migrationJob is enabled.</violation>
</file>
<file name="deploy/helm/studio/templates/_pod-template.tpl">
<violation number="1" location="deploy/helm/studio/templates/_pod-template.tpl:23">
P2: Using a truthy check for `terminationGracePeriodSeconds` drops explicit `0` values, so the configured value may be ignored.</violation>
<violation number="2" location="deploy/helm/studio/templates/_pod-template.tpl:217">
P2: `--skip-migrations` is appended to any custom `image.command`, which can break pods when the command is not the `deco` CLI.</violation>
</file>
<file name="deploy/helm/studio/templates/deployment.yaml">
<violation number="1" location="deploy/helm/studio/templates/deployment.yaml:1">
P1: Enabling Argo Rollouts removes the Deployment, but HPA still targets Deployment, leaving autoscaling with an invalid target.</violation>
</file>
P1: The migration hook runs before its ConfigMap/Secret dependencies exist, causing first install/sync failure when migrationJob is enabled.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At deploy/helm/studio/templates/migration-job.yaml, line 33:
<comment>The migration hook runs before its ConfigMap/Secret dependencies exist, causing first install/sync failure when migrationJob is enabled.</comment>
<file context>
@@ -0,0 +1,105 @@
+ labels:
+ {{- include "chart-deco-studio.labels" . | nindent 4 }}
+ annotations:
+ "helm.sh/hook": pre-install,pre-upgrade
+ "helm.sh/hook-weight": "-1"
+ "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
</file context>
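One possible way to address this, sketched under assumptions: give the ConfigMap (and likewise the Secret) that the Job reads the same hook annotations with a lower weight, so it is created before the Job runs. The `-config` name suffix, the `DATABASE_HOST` key, and the weight `-5` are hypothetical. A resource turned into a hook is no longer managed as a regular release resource, so this trade-off would need checking against how the runtime pods consume the same config.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "chart-deco-studio.fullname" . }}-config   # hypothetical name
  annotations:
    # Lower weight than the Job's "-1", so Helm creates this first
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation
    # Earlier sync wave than the Job within the PreSync phase
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/sync-wave: "-5"
data:
  DATABASE_HOST: "db.example.invalid"                         # hypothetical key and value
```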
P1: Enabling Argo Rollouts removes the Deployment, but HPA still targets Deployment, leaving autoscaling with an invalid target.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At deploy/helm/studio/templates/deployment.yaml, line 1:
<comment>Enabling Argo Rollouts removes the Deployment, but HPA still targets Deployment, leaving autoscaling with an invalid target.</comment>
<file context>
@@ -1,3 +1,4 @@
+{{- if not (and .Values.argoRollouts .Values.argoRollouts.enabled) }}
apiVersion: apps/v1
kind: Deployment
</file context>
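A possible fix, assuming the chart's HPA template targets the workload by the `chart-deco-studio.fullname` helper: switch `scaleTargetRef` on the same condition that gates the Deployment. Argo Rollouts supports being an HPA target through the `argoproj.io/v1alpha1` `Rollout` kind. The fragment below is a sketch, not the chart's actual HPA template.

```yaml
# Fragment of a HorizontalPodAutoscaler spec (surrounding template assumed)
scaleTargetRef:
  {{- if and .Values.argoRollouts .Values.argoRollouts.enabled }}
  apiVersion: argoproj.io/v1alpha1
  kind: Rollout
  {{- else }}
  apiVersion: apps/v1
  kind: Deployment
  {{- end }}
  name: {{ include "chart-deco-studio.fullname" . }}
```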
P2: Strategy selection silently falls back to blue-green when both strategy flags are false, producing inconsistent Rollout/Service manifests.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At deploy/helm/studio/templates/rollout.yaml, line 69:
<comment>Strategy selection silently falls back to blue-green when both strategy flags are false, producing inconsistent Rollout/Service manifests.</comment>
<file context>
@@ -0,0 +1,92 @@
+ trafficRouting:
+ {{- toYaml . | nindent 8 }}
+ {{- end }}
+ {{- else }}
+ blueGreen:
+ activeService: {{ default (include "chart-deco-studio.fullname" .) $bg.activeServiceName }}
</file context>
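One way to avoid the silent fallback is to validate the strategy flags up front and fail rendering when neither or both are set. The value paths `argoRollouts.blueGreen.enabled` and `argoRollouts.canary.enabled` are assumptions about the chart's values layout, and the error messages are illustrative.

```yaml
{{- /* Hypothetical value paths; adjust to the chart's actual values layout */}}
{{- $bg := .Values.argoRollouts.blueGreen | default (dict) }}
{{- $canary := .Values.argoRollouts.canary | default (dict) }}
{{- if and $bg.enabled $canary.enabled }}
{{- fail "argoRollouts: blueGreen and canary are mutually exclusive" }}
{{- else if not (or $bg.enabled $canary.enabled) }}
{{- fail "argoRollouts: enable one of blueGreen or canary" }}
{{- end }}
```

With this guard at the top of `rollout.yaml`, the existing `else` branch is only reached when blue-green was explicitly selected.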
P2: Using a truthy check for terminationGracePeriodSeconds drops explicit 0 values, so the configured value may be ignored.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At deploy/helm/studio/templates/_pod-template.tpl, line 23:
<comment>Using a truthy check for `terminationGracePeriodSeconds` drops explicit `0` values, so the configured value may be ignored.</comment>
<file context>
@@ -0,0 +1,220 @@
+ {{- toYaml . | nindent 4 }}
+ {{- end }}
+spec:
+ {{- if .Values.terminationGracePeriodSeconds }}
+ terminationGracePeriodSeconds: {{ .Values.terminationGracePeriodSeconds }}
+ {{- end }}
</file context>
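A common Helm idiom for this case is to test whether the value is set at all rather than whether it is truthy. The sketch below uses Sprig's `kindIs "invalid"`, which is true only for a missing or null value, so an explicit `0` is still rendered.

```yaml
spec:
  {{- /* Render whenever the value is set, including an explicit 0 */}}
  {{- if not (kindIs "invalid" .Values.terminationGracePeriodSeconds) }}
  terminationGracePeriodSeconds: {{ .Values.terminationGracePeriodSeconds }}
  {{- end }}
```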
P2: --skip-migrations is appended to any custom image.command, which can break pods when the command is not the deco CLI.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At deploy/helm/studio/templates/_pod-template.tpl, line 217:
<comment>`--skip-migrations` is appended to any custom `image.command`, which can break pods when the command is not the `deco` CLI.</comment>
<file context>
@@ -0,0 +1,220 @@
+{{- define "chart-deco-studio.podCommand" -}}
+{{- $cmd := default (list "bun" "run" "deco" "--no-local-mode") .Values.image.command -}}
+{{- if and .Values.migrationJob .Values.migrationJob.enabled -}}
+{{- $cmd = append $cmd "--skip-migrations" -}}
+{{- end -}}
+{{- toYaml $cmd -}}
</file context>
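A possible fix is to append the flag only when the chart's default command is in use and to pass a custom `image.command` through untouched. This is a sketch of one option; it assumes that users who override the command are responsible for adding `--skip-migrations` themselves when their command is the `deco` CLI.

```yaml
{{- define "chart-deco-studio.podCommand" -}}
{{- $cmd := list "bun" "run" "deco" "--no-local-mode" -}}
{{- if .Values.image.command -}}
{{- /* Custom command: leave it exactly as the user wrote it */ -}}
{{- $cmd = .Values.image.command -}}
{{- else if and .Values.migrationJob .Values.migrationJob.enabled -}}
{{- /* Default deco CLI command: safe to append the flag */ -}}
{{- $cmd = append $cmd "--skip-migrations" -}}
{{- end -}}
{{- toYaml $cmd -}}
{{- end -}}
```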
Summary by cubic
Adds opt-in Argo Rollouts support (blue-green/canary) and a pre-deploy migration Job to the `studio` Helm chart. Both are off by default, so existing installs keep the same Deployment and on-startup migration behavior.
New Features
- `argoRollouts.enabled`: renders a `Rollout` (argoproj.io/v1alpha1) instead of a `Deployment`. Blue-green by default with active/preview Services; canary supported; mutual-exclusion check if both are enabled. Shares the exact pod surface via `chart-deco-studio.podTemplate`.
- `migrationJob.enabled`: runs `bun run --cwd=apps/mesh migrate` once as Helm pre-install/upgrade and ArgoCD PreSync hook. Runtime pod command appends `--skip-migrations` to avoid N-replica races.
Migration
- `argoRollouts.enabled=true` (requires `argo-rollouts` controller).
- `migrationJob.enabled=true` to run migrations before pods start.
Written for commit b36722a. Summary will update on new commits.