Add repository consolidation #1150

Open
pyasi wants to merge 6 commits into buildkite:main from pyasi:add_repository_consolidation

@pyasi pyasi commented Jan 22, 2020


Consolidate jobs that use the same repository into a single build directory

With large repositories, we waste a lot of disk space on our nodes when we have numerous pipelines, because the agent creates build directories based on the pipeline slug. If all builds of the same repo used the same build directory, we could save that space and would not need to limit the number of pipelines we create.

This would be really helpful for our org, so I imagine other teams with large repositories could benefit as well.

Things I'd like feedback on

  • Language: specifically, whether "Consolidate all builds using identical repositories into a single build directory" is the right wording convention for this functionality.
  • Hashing: this approach ensures the directory name is alphanumeric, but it certainly obscures which repository the build dir belongs to. Thoughts? If this is the right approach, we could also move it into utils.
  • Protected env: I currently have this as a protected environment variable. I think this is the right approach, since it means the option must be configured at the agent level. Open to discussion though.
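
For readers following the hashing question, here is a minimal sketch of deriving a repository-based build directory. It is an illustration under stated assumptions, not the code in this PR: the helper names `repoDirName` and `consolidatedCheckoutPath`, the choice of SHA-256, the 16-character truncation, and the example paths are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// repoDirName returns a stable directory name for a repository URL.
// Hex encoding guarantees the name only contains [0-9a-f], at the cost
// of hiding which repository the directory belongs to.
// Assumption: SHA-256 truncated to 16 hex characters; the PR may hash differently.
func repoDirName(repoURL string) string {
	sum := sha256.Sum256([]byte(repoURL))
	return hex.EncodeToString(sum[:])[:16]
}

// consolidatedCheckoutPath builds BUILDS_DIR/AGENT_NAME/<hash>, so every
// pipeline that uses the same repository on the same agent shares one checkout.
func consolidatedCheckoutPath(buildsDir, agentName, repoURL string) string {
	return filepath.Join(buildsDir, agentName, repoDirName(repoURL))
}

func main() {
	// Hypothetical builds path, agent name, and repository URL.
	p := consolidatedCheckoutPath(
		"/var/lib/buildkite-agent/builds",
		"agent-1",
		"git@github.com:example/big-repo.git",
	)
	fmt.Println(p)
}
```

Because the agent name stays in the path, two agents on the same host would still get separate checkouts in this sketch; only pipelines sharing an agent and a repository collapse into one directory.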

Next steps

  • If we get sign off on this, I'd of course like to add tests.

@pyasi pyasi requested review from keithpitt, lox and matthewd January 22, 2020 16:35
Comment thread bootstrap/bootstrap.go
@pyasi pyasi requested a review from pda February 4, 2020 13:54

pyasi commented Feb 12, 2020

Author

@pda Would you mind reviewing this?

lox commented May 3, 2020

Contributor

Sorry it's taken us so long to get back to you @pyasi! My main concern with this is that it limits concurrent access by lots of pipelines to the one checkout 🤔

Given that the git-mirrors experiment uses a reference clone of a single repository folder, does that mitigate some of the concerns, or are there other factors?

pyasi commented May 4, 2020

Author

@lox All pipelines using the same repo will use the same checkout, correct. But since the path is still PATH/TO/BUILDS_DIR/BUILDKITE_AGENT-1/repo, would there ever be concurrent workloads running on the same checkout dir? Only if a single agent can run two jobs concurrently for multiple pipelines.

The git-mirrors experiment doesn't quite solve this because it still requires a "clone" of the repo for each pipeline. So let's say you have a 1 gig repo with 10 pipelines, that adds up quite fast.
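
The arithmetic behind that example can be sketched as follows. The function name `checkoutUsageGB` is hypothetical, and the model assumes every pipeline has been built at least once on each agent, so each one has a full working copy on disk.

```go
package main

import "fmt"

// checkoutUsageGB estimates the disk space taken by working copies on one host.
// Without consolidation each pipeline gets its own checkout per agent;
// with consolidation each agent keeps a single checkout per repository.
func checkoutUsageGB(checkoutGB float64, pipelines, agents int, consolidated bool) float64 {
	perAgent := checkoutGB * float64(pipelines)
	if consolidated {
		perAgent = checkoutGB
	}
	return perAgent * float64(agents)
}

func main() {
	// The numbers from the comment above: a 1 GB checkout and 10 pipelines on one agent.
	fmt.Printf("per-pipeline: %.0f GB, consolidated: %.0f GB\n",
		checkoutUsageGB(1, 10, 1, false),
		checkoutUsageGB(1, 10, 1, true),
	)
	// prints "per-pipeline: 10 GB, consolidated: 1 GB"
}
```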

We've solved this internally through a hook so no rush on this. But let me know if the above makes sense as a response to the concurrency issue you spotted.

lox commented May 4, 2020

Contributor

> The git-mirrors experiment doesn't quite solve this because it still requires a "clone" of the repo for each pipeline. So let's say you have a 1 gig repo with 10 pipelines, that adds up quite fast.

Interesting, you have a repo that is 1 GB checked out (i.e., not including the git history)?

pyasi commented May 4, 2020

Author

We used reference repos and would pull from them instead of from remote. This was to reduce traffic on GHE and speed up clone times, so it's not identical to the git-mirrors experiment. I vaguely remember looking into the experiment a while back and it wasn't solving our primary problem IIRC. I'd have to dig up the reason again though.

@pda pda self-assigned this May 11, 2020
Base automatically changed from master to main February 1, 2021 05:25
@pda pda removed request for lox, matthewd and pda December 5, 2023 22:56
@pda pda removed their assignment Dec 5, 2023
5 participants