Add Flower alongside live environment Celery workers#1617
Conversation
Co-authored-by: Davi Nakano <114549747+davinotdavid@users.noreply.github.com>
```
backend:
  image: "${{ steps.pulumi-tag-extract.outputs.pulumi_tag }}"
EOF
echo ".apmt_image: &APMT_IMAGE ${{ steps.pulumi-tag-extract.outputs.pulumi_tag }}" > newimage.yaml
```
This is simpler now due to the DRYing out of the task definitions.
```
'urn:pulumi:prod::appointment::tb:fargate:FargateClusterWithLogging$aws:ecs/taskDefinition:TaskDefinition::appointment-prod-fargate-backend-taskdef' \
pulumi up -y --diff \
  --target 'urn:pulumi:stage::appointment::tb:fargate:FargateClusterWithLogging$aws:ecs/taskDefinition:TaskDefinition::appointment-stage-fargate-backend-taskdef' \
  --target 'urn:pulumi:stage::appointment::tb:fargate:AutoscalingFargateCluster::appointment-stage-afc-appointment' \
```
This makes sure that images get deployed to both clusters when we do a release.
```yaml
### Special variables used throughout this file

# Update this value to update all containers based on the thunderbird/appointment image
.apmt_image: &APMT_IMAGE 768512802988.dkr.ecr.eu-central-1.amazonaws.com/thunderbird/appointment:7de6f16bdd309937caa186c8f5a269ea00118e5e
```
This is the variable referenced in the workflow changes. This gets used in the task definition below, and that task definition gets used for all three services. So you change this in one place and it goes out to All The Things.
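The anchor/alias mechanics can be sketched with PyYAML. The service names below are illustrative stand-ins for the real task definitions, not our actual config:

```python
import yaml  # PyYAML

# Illustrative sketch of the &APMT_IMAGE anchor: every alias (*APMT_IMAGE)
# resolves to the single value defined at .apmt_image, so changing that one
# line updates every container. Service names here are made up for the demo.
doc = """
.apmt_image: &APMT_IMAGE 768512802988.dkr.ecr.eu-central-1.amazonaws.com/thunderbird/appointment:7de6f16bdd309937caa186c8f5a269ea00118e5e
backend:
  image: *APMT_IMAGE
celery-worker:
  image: *APMT_IMAGE
flower:
  image: *APMT_IMAGE
"""
config = yaml.safe_load(doc)

# All three services share the one image tag.
assert config["backend"]["image"] == config[".apmt_image"]
assert config["celery-worker"]["image"] == config["flower"]["image"]
```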
```yaml
tb:cloudwatch:LogDestination:
  appointment:
    org_name: tb
```
This creates a privacy policy compliant CloudWatch Log Group called /tb/prod/appointment that these containers will now produce logs in.
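As a rough illustration of the naming convention (not the actual tb_pulumi implementation), the group name is the org name, stack, and project joined into a path:

```python
# Illustrative helper mimicking the /{org}/{env}/{project} log group naming
# convention described above; the real logic lives in tb_pulumi's
# LogDestination class.
def log_group_name(org: str, env: str, project: str) -> str:
    return f"/{org}/{env}/{project}"

assert log_group_name("tb", "prod", "appointment") == "/tb/prod/appointment"
```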
```yaml
# protocol: tcp
# from_port: 6379
# to_port: 6379
# source_security_group_id:
```
This is where I will add the security groups to grant access to Redis later on.
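For reference, once uncommented, a rule like that would reach the Pulumi code shaped roughly like this dict. The values mirror the commented-out config above; nothing here is deployed yet:

```python
# Hypothetical shape of the eventual Redis ingress rule, mirroring the
# commented-out config lines above. Assumption, not deployed config.
redis_ingress_rule = {
    "protocol": "tcp",
    "from_port": 6379,  # standard Redis port
    "to_port": 6379,
    # "source_security_group_id" left unset: the code's fallback fills in
    # the Appointment backend container's security group.
}

assert "source_security_group_id" not in redis_ingress_rule
```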
```yaml
fargate_task_role_arns:
  - arn:aws:iam::768512802988:role/appointment-prod-fargate-backend
  - arn:aws:iam::768512802988:role/appointment-prod-afc-appointment-celery
  - arn:aws:iam::768512802988:role/appointment-prod-afc-appointment-flower
```
These represent new permissions needed for the CI process to deploy to the new cluster and tasks.
```diff
@@ -1,3 +1,3 @@
-tb_pulumi @ git+https://github.com/thunderbird/pulumi.git@v0.0.16
+tb_pulumi @ git+https://github.com/thunderbird/pulumi.git@v0.0.18
```
This contains many fixes to both the LogDestination class and AutoscalingFargateCluster class that we need for these new resources to come out right. Ref: https://github.com/thunderbird/pulumi/blob/main/CHANGELOG.md
```diff
 for rule in backend_cache_sg_ingress_rules:
-    rule['source_security_group_id'] = container_sgs.get('backend').resources.get('sg').id
+    if 'source_security_group_id' not in rule and 'cidr_blocks' not in rule:
+        rule['source_security_group_id'] = container_sgs.get('backend').resources.get('sg').id
```
Normally, the way we use these SGs in code is to automatically link up the Appointment backend container to Redis. If we want to link up any other source, we would need to specify that in code or config somewhere. Rather than bog this code down in lots of conditions based on expected strings in the config, I've changed this to allow us to specify a source in config, and to fall back on the Appointment backend container if no more explicit source is defined.
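The fallback can be demonstrated in isolation with plain dicts. The security group IDs below are placeholders standing in for the real `container_sgs` lookup:

```python
# Sketch of the fallback: a rule that names its own source (or uses CIDR
# blocks) is left alone; otherwise the backend container's SG is filled in.
# "sg-backend" and "sg-flower" are placeholder IDs for this demo.
BACKEND_SG_ID = "sg-backend"

rules = [
    {"protocol": "tcp", "from_port": 6379, "to_port": 6379},  # no source -> fallback
    {"protocol": "tcp", "from_port": 6379, "to_port": 6379,
     "source_security_group_id": "sg-flower"},                # explicit source kept
    {"protocol": "tcp", "from_port": 6379, "to_port": 6379,
     "cidr_blocks": ["10.0.0.0/16"]},                         # CIDR rule kept
]

for rule in rules:
    if "source_security_group_id" not in rule and "cidr_blocks" not in rule:
        rule["source_security_group_id"] = BACKEND_SG_ID

assert rules[0]["source_security_group_id"] == "sg-backend"
assert rules[1]["source_security_group_id"] == "sg-flower"
assert "source_security_group_id" not in rules[2]
```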
```yaml
celery-flower:
  <<: *backend
  ports:
    - 5556:5555
```
Davi requested this port exposure change so that this does not conflict with a simultaneously running Flower container for a local dev instance of Accounts.
davinotdavid
left a comment
Overall lgtm with a few comments / questions / double checking, but I'd love to have input from other infra folks as I am not as familiar with the devops intricacies!
```yaml
- *VAR_ZOOM_API_ENABLED
- *VAR_ZOOM_API_NEW_APP
- name: CONTAINER_ROLE
  value: celery
```
Same here, shouldn't this be beat / worker? If so, I wonder if we need two sets of these configs, one for each?
I just fixed "celery" to "worker". As far as I'm aware, the beat container doesn't need to be deployed to live environments. That will ultimately be removed in favor of real tasks, right?
There's going to be a task in another PR that needs to run periodically every week in production as well, so perhaps we still need the beat container there too? Not sure if we need a separate container though, as @Sancus added something called celery-redbeat to Accounts with a single container (ref thunderbird/thunderbird-accounts#696), so maybe that's also an option.
I had not heard of redbeat, so I went and found its docs. It looks like you just configure your normal Celery container with this and a Redis key to use for its distributed lock. Then you define tasks as part of your Celery config. So it sounds to me like we wouldn't need a separate container to emit these events on a schedule.
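A minimal sketch of that setup, assuming redbeat's documented settings; the broker URL, task name, and schedule below are placeholders, not our real config:

```python
from celery import Celery
from celery.schedules import crontab

# Sketch only: URLs and task names are placeholders, not deployed config.
app = Celery("appointment", broker="redis://redis.example.internal:6379/0")

# RedBeat keeps the schedule and a distributed lock in Redis, so an ordinary
# Celery worker container can also act as the beat scheduler without a
# dedicated beat container.
app.conf.beat_scheduler = "redbeat.RedBeatScheduler"
app.conf.redbeat_redis_url = "redis://redis.example.internal:6379/1"

app.conf.beat_schedule = {
    "weekly-prod-task": {  # hypothetical name for the task from the other PR
        "task": "appointment.tasks.weekly_task",
        "schedule": crontab(minute=0, hour=3, day_of_week=1),  # Mondays 03:00
    },
}
```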
```yaml
- *VAR_ZOOM_API_ENABLED
- *VAR_ZOOM_API_NEW_APP
- name: CONTAINER_ROLE
  value: celery
```
Same here, just double checking!
I fixed this in all three files just now.
Merged, will keep an eye on the workflows and follow through with the prod work now.
This PR adds Celery workers to our live environments alongside Flower for worker visibility.
What's in the PR
- `CONTAINER_ROLE` variable value "api" to indicate the regular backend as opposed to Celery, Flower, or something else.

What's Not in the PR
I have intentionally left the new prod Celery and Flower services scaled down to zero instances, because they will not work if brought online: we still have to allow these containers access to the Neon DB PrivateLink security group (a manual step) and to the Redis cluster (which will be codified after the prod security groups have been created).
So there will be a small future PR coming after this is fully deployed.
Relevant Tickets