Add AGENTS.md / CLAUDE.md#23275
Conversation
|
Are these all things things you have had to tell a LLM? That is things it could not figure out from the existing context? To be concrete, I use Claude Code with Opus and don't recall having to tell it to |
Also using claude with opus and yeah, without the |
|
Interesting, I've been experimenting with Composer2 and haven't had to tell it any of this stuff. I just told it to first look at the actions to figure out what to do and sent it off to make a bunch of experimental, mid, changes. |
|
Also, you should strip out the CLAUDE.md from your other PR so it's not conflated. |
|
I supposed it's not as extensive as what I've got here but the bazel repo also has an AGENTS.md with some details on how to use bazel so I swear I'm not crazy. The additional context could also be potentially helpful for poor souls who try to make a go at making changes with less powerful models. Happy to amend or remove anything from the doc though. |
|
To be clear, I don't think you are crazy! I just have no idea how to test this sort of thing and 'best practices' are in wild flux. At DAYJOB we have a wide spread between "LLMs can't work with pants unless we repeat |
| ) | ||
| ``` | ||
|
|
||
| All BUILD files must have the copyright header: |
There was a problem hiding this comment.
I have seen LLMs use the literal year (2024 in this case) from these kinds of instructions instead of updating it to the current year. Maybe a nudge to update the year will teach them to be sensible.
| | Run a single test function | `pants test path/to/file_test.py -- -k test_function_name` | | ||
| | Debug a test (interactive) | `pants test --debug path/to/file_test.py` | | ||
| | Debug with specific test | `pants test --debug path/to/file_test.py -- -k test_name` | | ||
| | Lint changed files | `pants --changed-since=HEAD lint` | |
There was a problem hiding this comment.
You might want to --changed-since=main ? If you're stacking commits and you don't run this on every commit.
|
|
||
| ```bash | ||
| # Run tests matching a pattern | ||
| pants test path/to/file_test.py -- -k "test_foo or test_bar" |
There was a problem hiding this comment.
You can point out that everything after -- is passed through to pytest, and tell the LLM to go learn the pytest cli?
There was a problem hiding this comment.
I think it's worth mentioning the passthrough behavior, so that the LLM doesn't go off on a tangent trying to learn those options from Pants when they are actually pytest options.
| pants test --output=all path/to/file_test.py | ||
|
|
||
| # Force re-run (skip cache) | ||
| pants --no-local-cache test path/to/file_test.py |
There was a problem hiding this comment.
Specifically for tests you can test --force which forces the test to rerun but allows everything up to that point to use the cache.
|
|
||
| ### File headers | ||
|
|
||
| All source files must have the copyright header: |
There was a problem hiding this comment.
Did we not already say this above?
|
|
||
| ### Test function naming | ||
|
|
||
| - Test files: `*_test.py` (NOT `test_*.py`) |
| ### Integration vs unit tests | ||
|
|
||
| - Unit tests: `*_test.py` - run in normal test target | ||
| - Integration tests: `*_integration_test.py` - get their own BUILD target with longer timeouts |
There was a problem hiding this comment.
integration tests typically run an instance of Pants, whereas unit tests run individual functions or rules.
|
|
||
| PEX is special - update both: | ||
| 1. The `pex-cli` subsystem in `src/python/pants/backend/python/util_rules/pex_cli.py` | ||
| 2. The requirement in `3rdparty/python/requirements.txt` and regenerate lockfile |
There was a problem hiding this comment.
This is not true any more. Pex is no longer consumed as a requirement, only as a CLI tool.
If this is still mentioned in the docs somewhere then that needs fixing.
I've had a lot of success getting LLMs to do what I want by threatening them with things that would probably get me flagged by HR 😆 |
|
Just a heads up, I have no decent opinions on any of this nor any good feedback. @tdyas @cburroughs added you two as reviewers, as I think you both have more experience (definitely more than me, anyways) re: approving this kind of PR I don't even know what a metric would be for this kinda thing 🤷🏽 My (EXTREMELY limited) experience with AGENTS.md and stuff is copied from some random Github repo that seemed to align with my mentality of "write as little code as possible, don't try to be too clever, ask when you don't understand" - seems to work 🤷🏽 |
| pants fmt src/python/pants/backend/python/goals/pytest_runner.py | ||
| pants check src/python/pants/backend/python/goals/pytest_runner.py | ||
|
|
||
| # WRONG - never do this |
There was a problem hiding this comment.
It hurts my soul that it's not enough to show just the positive approach, but you need to show the negative as well 🤦🏽
|
|
||
| ### Running Pants from source | ||
|
|
||
| In this repo, `./pants` is a special bootstrap script that runs Pants from the local source tree. Use `pants` (which resolves to the `./pants` script) for all operations. The first run compiles the Rust engine and may take several minutes. |
There was a problem hiding this comment.
For anything non-performance, until a certain upcoming PR gets done, it may be smarter to run:
MODE=debug pants to save on compilation time, and to re-use the same compiled objects for tests.
| @@ -0,0 +1,555 @@ | |||
| # Pants Build System Contributor Guide | |||
|
|
|||
| This is the Pants build system repo -- a self-hosting build system where `./pants` runs Pants from source. Follow the conventions, architecture, and workflows specific to this codebase. | |||
There was a problem hiding this comment.
Later on, the script says to use pants not ./pants
|
|
||
| Every directory with source code needs a `BUILD` file. Common patterns: | ||
|
|
||
| ```python |
There was a problem hiding this comment.
I don't use tailor - but I think running a tailor check would ensure this rule.
| | Show target info | `pants peek path/to/target` | | ||
| | Generate BUILD files | `pants tailor` | | ||
| | Validate BUILD files | `pants lint --only=ruff-format '**BUILD'` | | ||
| | Run pre-push checks | `build-support/githooks/pre-push` | |
There was a problem hiding this comment.
Is there value in also showing the merged forms? e.g. fix fmt lint check test?
| def rules(): | ||
| return [ | ||
| *collect_rules(), | ||
| *FooTestRequest.rules(), | ||
| ] |
There was a problem hiding this comment.
| def rules(): | |
| return [ | |
| *collect_rules(), | |
| *FooTestRequest.rules(), | |
| ] | |
| def rules() -> Iterable[Rule | UnionRule]: | |
| return ( | |
| *collect_rules(), | |
| *FooTestRequest.rules(), | |
| ) |
| - Must be complete sentences ending with a period | ||
| - Max 100 characters per line | ||
| - Explain **why**, not **what** | ||
| - TODOs must reference a GitHub issue: `# TODO(#1234): Description.` |
There was a problem hiding this comment.
I don't understand this. Is the expectation that we'd get new issues for a TODO that the machine creates?
|
What are people's thoughts on the fact that A trend I have been seeing is using |
| async def my_rule(request: MyRequest) -> MyResult: | ||
| ... | ||
|
|
||
| def rules(): |
There was a problem hiding this comment.
| def rules(): | |
| def rules() -> Iterable[Rule | UnionRule]: |
| python_tests( | ||
| name="integration", | ||
| sources=["*_integration_test.py"], | ||
| timeout=240, |
There was a problem hiding this comment.
This is only necessary if it times out
| - Be concise but descriptive | ||
| - Reference GitHub issues where applicable | ||
|
|
||
| ## Maintenance Tasks |
There was a problem hiding this comment.
I'd rather not have any PRs for lockfiles/tooling updates. I've got some PRs I'm working on to deterministically automate that stuff, so we don't need to burn down a rainforest to update a sha256.
Also, they're basically impossible to review for supply chain concerns.
| result.assert_success() | ||
| ``` | ||
|
|
||
| ## PR and Contribution Workflow |
There was a problem hiding this comment.
I don't see a section for the PR message itself, but full disclosure about LLM generation is important.
I was literally just joking about this with a friend after seeing some "Claude Code not available on $20 plan anymore" comments on the interwebs. Can only burn so much cash, and having to maintain some long Agents files may not be ideal. I don't know what the state-of-the-art is here, but if we had a slimmed down Agents, I would like to see some of the "never do this" cases that we've discussed over the past few weeks/months. As always, ghostty seems to have some good ideas: https://github.com/ghostty-org/ghostty/blob/main/AGENTS.md |
|
Question on this - what are the tangible next steps? There is a lot of (really good) content in the file, but there is also the question of what kind of need this has, token budgets, skills, etc. For a tangible next step, personally as a maintainer (again - LLMs are not really my thing), I would prefer what I've seen in some other repos I use, where AGENTS is used a bit more for HOW a clanker should interact with the repo/issues/PRs, rather than enumerating everything about the repo. Short and sweet, and with the intention of making our lives a bit easier with PRs we don't want to see, what we expect/don't, etc. Something more like:
As I say this, I guess that could also just be a CONTRIBUTING.md that this calls out to... |
Yes this. Models are getting better all the time and can just read code to discover information about the code itself. Moreover, I still have an objection to imposing large |
There was a problem hiding this comment.
I don't think we should have a single large AGENTS.md or CLAUDE.md file in the root of this repo. A small one? Yes. But not a large one like this. Most of the guidelines in here could be broken down into smaller "skills" (.claude/skills) that the agent would load when it needs them.
Guidelines for BUILD files? Make a build files skill that loads when working with BUILD files.
Guidelines for running tests? Make a tests skill that loads when working with tests.
etc
I think we can all agree that AI coding is here to stay at this point. Even as someone who was resistant for a long time, I've now been using Claude more and more and admittedly it can be extremely effective with surgical prompts and some light editing. I used it fairly extensively for this PR and generating the
AGENTS.md/CLAUDE.md, which I had it generate from the Pants docsite, made it much more effective at navigating the repo and usingpantscorrectly.