Add an LLM policy for rust-lang/rust #1040
## Policy
Member
Suggested change
Adding a title that mentions LLM usage, and flagging this as interim to foreshadow the section at the end noting that policies may evolve. I am hopeful that this is capturing a sentiment shared both by people who want the policy to be stricter and by people who want the policy to be less strict.
Member
Author
I stand by this policy. I would be happy for this to be a semi-permanent policy. We can of course edit it, but I consider "interim" to be a forward-looking statement and I don't want to make those in this policy.
For additional information about the policy itself, see [the appendix](#appendix).

### Overview
Member
While this policy hasn't made everyone happy, I think it's struck a reasonable balance that seems to have resonated with a decent number of people. I think one of the reasons for that is that it avoids too much "framing". By "framing", I'm thinking about how we introduce the policy; how we talk about our thinking on AI as a project; and what the basis for a policy is. I think there's still a decent amount of disagreement on that, and this explains why it's been a little difficult to make this more or less restrictive. There are two dominant starting points in our discussions so far - some of the project are starting from a position of "use of these tools is unethical for various reasons, therefore I don't think we should have anything to do with them, and so my starting point in this discussion is to prohibit all use" ("the ethical framing"), and others don't fully share those ethical concerns, so their starting position is "we've always let people use whatever tools they want, and only introduced restrictions when they could be justified by an impact on the ability of the project to do its job (i.e. impact social cohesion, quality of the toolchain, sustainability of reviews, etc)" ("the pragmatic framing").

Starting points like these constitute the basis for any policy in the project - on what grounds we're imposing a restriction or permitting something - and this policy doesn't have much of one. This policy focuses mostly on what we permit or restrict, and while it touches on why briefly, it doesn't say too much - so you can read it as "these are the things I'm willing to compromise on from a prohibition on everything" or as "these are the restrictions we've landed on that are justified by impacts on the project". My concern is that without framing, we leave people to fill the gaps themselves and come to their own (potentially incorrect) conclusions about what the basis for this policy is.

Within the project, that could be a problem if someone proposes a future change to this policy, because the author might have a "pragmatic framing" - seeing their amendment as justified by what we then know about the impact on the project - while a reviewer might see the amendment as an unreasonable ask for more compromise on total prohibition, as they understood the basis for this policy as coming from an "ethical framing". Outside the project, people might read this and see it as the project making an endorsement of AI because it permits some use, because they are inferring an "ethical framing", when an explanation of the "pragmatic framing" might have cleared that up.

I think the appropriate framing for the project to take with regards to policies like this is the "pragmatic framing". We're all entitled to our ethical concerns and criticisms of LLMs, and those perspectives are absolutely valid, but I believe it's very tricky to make them a solid basis for policy that encourages a diverse and varied community such as ours to work together. To give a silly example: if I had the strong stance that we should ban all contributions that didn't use British English, because that's objectively correct and there's a King who'll say so, and that was a strong ethical/moral stance for me (feel free to replace this with a stance that you find more compelling), then how do we decide what to do with that? Others will disagree, so who wins?

Whose ethical stance is correct? We could litigate the actual debate - but I don't think any of us want that. We could just pick the side of the person with the concern - but then we're effectively prescribing a "correct view" for contributors to have, and the more we do that, the fewer people will agree with every concern that has become policy - alienating more people from the project. It just isn't a tenable basis for policy. It might have been ten years ago with a much younger Rust, but for almost any issue, the ship has sailed; we've already got valued contributors who disagree on most topics. I want to be part of a project where we can each have our strongly-held perspectives, as long as we treat each other with dignity and respect, and can co-exist with those who might disagree - an issue is only relevant when it affects the project's ability to do its job (as some of the concerns with AI are keen demonstrations of, though not all of them).

As such, I think the only practical basis for this policy is the "pragmatic framing", and as I've said above, I think we should include some preamble to any policy like this that describes the basis for the policy. I'm reminded of @nikomatsakis's earlier wording along these lines:

I had some similar phrasing in early sketches from a while back:

I don't want this comment to expand the scope of this policy too much - it's good that it is narrow and specific and concise, but I think it'll be easier to get people on board when we're clear about the basis for the policy, as that has implications for whether it can evolve, and also in avoiding misinterpretation. I'm not tied to any of the specific phrasing in the quotes above, but feel free to use them as a starting point if inclined to act on this comment.
Member
Author
<3 I like this a lot and I think it's the core of what makes this policy work. I've made this "overview" section shorter, but added a longer "motivation and guiding principles" section towards the end, with a modified version of Niko's quote.

Using an LLM while working on `rust-lang/rust` is conditionally allowed.
However, we find it important to keep the following points in mind:

- Many people find LLM-generated code and writing deeply unpleasant to read or review.
- Many people find LLMs to be a significant aid to learning and discovery.

Therefore, the guidelines are roughly as follows:

> It's fine to use LLMs to answer questions, analyze, distill, refine, check, suggest, review. But not to **create**.

> LLMs work best when used as a tool to write *better*, not *faster*.
Comment on lines +16 to +17
Member
Suggested change
Having this as a high-level summary is offering a judgement on LLMs that feels like it isn't necessary for the policy, and makes consensus more difficult to reach. For anti-LLM folks it's saying that they work best when used to write "better", which is a point in dispute. I would also expect (but don't want to put words in people's mouths) that for pro-LLM folks the point that they don't work well when used to work faster may be in dispute. I've tried to rephrase this in a fashion that, rather than expressing a general statement on when "LLMs work best", is instead expressing what is desired *for
Member
Author
This is adapted from a quote by @ubiratansoares. This edit changes the quote beyond recognition, and I would rather remove it than edit this much.
Member
Then I think it would be best removed, on the basis that the previous line covers similar territory and seems less controversial.
Member
Tbh I don't actually understand what this quote is supposed to mean; if anything, I would phrase it the other way around (you can use LLMs to do [things you can already do] to get them done faster, but you shouldn't use them to do things you don't already know how to do yourself).
Member
Honestly, it was that quote that I took back to my team to rework our approach to AI-generated code. I think that statement itself has a lot of weight.
#### Legend

- ✅ Allowed
- ❌ Banned
- ⚠️ Allowed with caveats. Must disclose that an LLM was used.
- ℹ️ Adds additional detail to the policy. These bullets are normative.

### Rules

#### ✅ Allowed
The following are allowed.
- Asking an LLM questions about an existing codebase.
- Asking an LLM to summarize comments on an issue, PR, or RFC.
- ℹ️ This does not allow reposting the summary publicly. This only includes your own personal use.
- Asking an LLM to privately review your code or writing.
- ℹ️ This does not apply to public comments. See "review bots" under ⚠️ below.
- Writing dev-tools for your own personal use using an LLM, as long as you don't try to merge them into `rust-lang/rust`.
- Using an LLM to discover bugs, as long as you personally verify the bug, write it up yourself, and disclose that an LLM was used.
  Please refer to [our guidelines for fuzzers](https://rustc-dev-guide.rust-lang.org/fuzzing.html#guidelines).
- ℹ️ This also includes reviewers who use LLMs to discover bugs in unmerged code.

#### ❌ Banned
The following are banned.
- Comments from a personal user account that are originally authored by an LLM.
- ℹ️ This also applies to issue bodies and PR descriptions.
- ℹ️ See also "machine-translation" in ⚠️ below.
- Documentation that is originally authored by an LLM.
- ℹ️ This includes non-trivial source comments, such as doc-comments or multiple paragraphs of non-doc-comments.
Member
Suggested change
Reordering this to make it clear first and foremost that "Documentation" includes any doc comments, moving "non-trivial source comments" second. This also drops the quantitative "multiple paragraphs"; some multi-paragraph comments may be trivial, and some one-sentence comments may not be.
Member
Author
If you are using an LLM to write a multi-paragraph comment that is trivial, IMO that should also be banned. If you have a load-bearing single-line comment, I think that falls under "code changes authored by an LLM", although I'm not sure how to say that concisely.

- ℹ️ This includes compiler diagnostics.
Member
Suggested change
Member
Author
We cannot be exhaustive in this policy and I think it hurts us to try.

- Code changes that are originally authored by an LLM.
Member
This feels overly restrictive in its current wording, in a way that I'm not comfortable leaving unraised as a compiler team member. There is some nuance here that this doesn't capture but should. Certainly, I think in general I'm happy to ban "unsolicited" code that is LLM-generated, but I think that an outright ban on all "non-trivial" LLM-generated code is too strong. I'd like to see LLM-generated code allowed under the following strong caveats:

I personally think this is a pretty reasonable space to carve out for "experimentation": it doesn't subject reviewers who don't want to review LLM-generated code to unwanted reviews, it helps to ensure that code stays high-quality, and it limits the fallout of any "mistakes" in the process.

"The code is well-tested" is another valuable caveat to add here. Requiring this is much less onerous in the context of LLM-assisted code.
Member
I like it. I think it's a standard we want to hold for all contributions, but one that doesn't always get met. It's a nice position to have here.
Member
I'd quite like to see an explicit carve-out for teams or even individuals to do some experimentation - in specific areas or with specific maintainers, in a way that wouldn't affect maintainers who aren't interested in participating. Teams would obviously need to decide if they wanted to have such an experiment, but it would be useful input to any future revisions - e.g. "hey, we tried this in a controlled environment over here and we actually found it useful and helpful, maybe we could consider relaxing this point", etc.

- This does not include "trivial" changes that do not meet the [threshold of originality](https://fsfe.org/news/2025/news-20250515-01.en.html), which fall under ⚠️ below.
  We understand that while asking an LLM research questions, it may, unprompted, suggest small changes where there really isn't another way to write them.
  However, you must still type out the changes yourself; you cannot give the LLM write access to your source code.
Member
Suggested change
Trying to address potential reactions of "why are you making me re-type this?!".
Member
Author
I think if we want a rationale for each rule, that should be a separate document, not part of the moderation and contributing guidelines.
Member
This is very weird to me. Either the change is small enough to be trivial, or it is not. I'm not sure what typing it out does? Beyond this, it's not clear what this is aimed at. Is this aimed at when someone is conversing back and forth with an agent and it says "I suggest you do XYZ", or is this aimed at autocomplete-like code generation?
Member
Author
I've removed the requirement to type out the code yourself.

- We do not accept PRs made up solely of trivial changes.
Member
This is really just not correct. We accept trivial changes all the time (e.g. renaming a struct because it's confusing). It's sort of like what Josh is saying: what does "trivial" mean?
Member
Author
I've reworded this significantly; let me know what you think.

  See [the compiler team's typo fix policy](https://rustc-dev-guide.rust-lang.org/contributing.html#writing-documentation:~:text=Please%20notice%20that%20we%20don%E2%80%99t%20accept%20typography%2Fspellcheck%20fixes%20to%20internal%20documentation).
- See also "learning from an LLM's solution" in ⚠️ below.
- Treating an LLM review as a sufficient condition to merge a change.
  LLM reviews, if enabled by a team, **must** be advisory-only.
  Teams can have a policy that code can be merged without review, and they can have a policy that code must be reviewed by at least one person,

Comment on lines +51 to +52
Member
Given that this is limited to rust-lang/rust, probably better to just restrict to no LLM reviews.
Member
Author
I actually really want to keep allowing LLM reviews. I think they're low-risk and give people a chance to see whether the bot catches real issues.

  but they may not have a policy that an LLM counts as a person.
- ℹ️ See "review bots" in ⚠️ below.
- ℹ️ An LLM review does not substitute for self-review. Authors are expected to review their own code before posting and after each change.

#### ⚠️ Allowed with caveats
The following are decided on a case-by-case basis.
Please avoid them where possible.
In general, existing contributors will be treated more leniently here than new contributors.

We may ask you for the original prompts or design documents that went into the LLM's output;
please have them on-hand, and be available yourself to answer questions about your process.
- Using an LLM to generate a solution to an issue, learning from its solution, and then rewriting it from scratch in your own style.
Member
Of course, see my comment on the "Code changes that are originally authored by an LLM." ban, but I do like laying out this "less-restrictive" point explicitly. I would move the "asking for details about how you generated the solution" to under this point, but modify it heavily. Rather than stating something like "we need to know exactly what you said to the LLM and what model you used", I think a better approach is saying something like "You should be prepared to share the details of the direction you gave to the LLM. These may include general prompts or design documents/constraints." I'm not sure that sharing the exact prompts or output, or the exact model, does anything. What's the reasoning? I'm much more interested in what direction the author intended to take. If the idea is to be able to "recreate" or "oversee" what the author did, that's just never going to work. This isn't something we can reasonably expect reviewers at large to do. Rather, if anything, this is something that I could see from a more mentor/mentee relationship. If it ever gets to the point that a "random" reviewer wanted or needed to see this, then the PR likely just needs to be closed and further discussion should happen elsewhere before continuing.

- Using machine-translation from your native language without posting your original message.
  Doing so can introduce new miscommunications that weren't there originally, and prevents someone who speaks the language from providing a better translation.
- ℹ️ Posting both your original message and the translated version is always ok, but you must still disclose that machine-translation was used.
- ℹ️ This policy also applies to non-LLM machine translations such as Google Translate.
- Using an LLM as a "review bot" for PRs.
Member
Maybe I'm OOTL but I find this section situationally strange — where did the "review bot" come from? IME AI-powered review bots that directly participate in PR discussions (esp the "app" ones) are configured by the repository owner, but AFAIK r-l/r (which this policy applies solely to) did not have any such bots. I highly doubt a contributor will bring in their own review bot in public. So practically this has to be either
Member
I wish it worked like that :( People can just trigger GitHub copilot, or I suppose any other review bot, and let it comment on a r-l/r PR. Some people don't even do it willingly, but GH does it automatically for them, as GH copilot has a tendency to re-enable itself even if you sometimes disable it. It is also not possible to opt out of the PR author requesting a Copilot review, if I remember correctly.

I've seen this behavior elsewhere on GitHub, where contributors effectively use a personal account as a kind of "review bot" to comment on PRs without approval from maintainers.
Member
Yeah, currently disabling review is a personal/license-owner setting; it is not possible to configure from the repository PoV 😞 but I think this is something that we may bring up to GitHub. It may be possible to use content exclusion to blind Copilot, but I'm not sure if this hack is going to produce any overreaching effects (e.g. affecting private IDE usage too).
Contributor
I think this is exactly the point of calling that out in our policy. Some people trigger a "[at]copilot review" in our repos without asking us for consent. This is rude behaviour and we don't want that. And, yes, as you point out, opting out of this "trigger" is currently only a project-wide setting, not a repository-level one, so we are checking with GitHub whether they could make this setting more fine-grained (there is a discussion with the Infra team on Zulip).
Member
Author
@clarfonthey I understand you are frustrated but it doesn't help to take it out on the people we're working with. Can I ask you to take a break from commenting on this RFC for a bit? Feel free to DM me with any concerns you have about the policy itself.

yeah, you're right; I deleted the comment

Unsolicited review bots are becoming an increasing problem; for example: https://web.archive.org/web/20260426133344/https://github.com/rust-lang/rust-clippy/issues/16893#issuecomment-4321880160

Thank you for flagging, xtqqczze - the same bot has commented in 6+ issues on the rust-clippy repo and in my case was giving unsolicited advice in a completely derailing direction (solving a specific case I obviously already worked around rather than the general case rust-lang/rust-clippy#16901 (comment))
Member
@xtqqczze both rust-lang/rust-clippy#16893 and rust-lang/rust-clippy#16901 are issues, not PRs, and that

- ℹ️ Review bots **must** have a separate GitHub account that marks them as an LLM. They **must not** post under a personal account.
- ℹ️ Review bots that post without being approved by a maintainer will be banned.
I'm concerned this leaves room for reviewers to trigger a review bot without the consent of the PR author, which could alienate the PR author. If I opened a PR and it got reviewed by an LLM bot, I would probably close the PR and never try contributing to the project again. I've seen this happen in another project. I think there should be an agreement between the reviewer and PR author before triggering a review bot.
Member
"approved by a maintainer" is the key point here; if an LLM review bot is "approved by a maintainer", that means it is a public decision and should be mentioned in CONTRIBUTING.md, and that's the agreement.

An agreement among maintainers to impose LLM review bots on nonconsenting contributors would drive those contributors away.

If a reviewer really wants to use an LLM to review, they could run that LLM on their own, filter through the output to determine what is actually relevant and correct, and post in their own words about the identified problems. That doesn't require bothering a nonconsenting PR author with LLM output.
Member
Rephrasing LLM output is already addressed in lines 67-68. The premise of this whole section is that somehow a bot (as a separate account, line 69) can be officially "approved". If you think that a review bot account should not be allowed, even if approved by maintainers, this whole thread would be more relevant on the parent item (line 66; I've commented about this before). P.S. I don't think this policy implies any LLM review bot account will be allowed "right now" or "soon"; I believe there must at least be an FCP.

Thinking about this further, this seems like an overall better process than having a review bot comment on a PR. There's no room for ambiguity about whether a PR author is responsible for responding to LLM output; only the reviewer who decides to use an LLM is in a position to interpret the LLM output, because "Comments from a personal user account that are originally authored by an LLM" are explicitly forbidden.
- ℹ️ If a linter already exists for the language you're writing, we strongly suggest using that linter instead of or in addition to the LLM.
- ℹ️ Please keep in mind that it's easy for LLM reviews to have false positives or focus on trivialities. We suggest configuring it to the "least chatty" setting you can.
- ℹ️ LLM comments **must not** be blocking; reviewers must indicate which comments they want addressed. It's ok to require a *response* to each comment but the response can be "the bot's wrong here".
I don't think it's okay to require PR authors to have to say "the bot's wrong here"; the onus should be on whoever triggers the bot to determine whether there's any validity to what the bot posted.
Member
I don't see how line 73 disagrees with this. The statement "It's ok to require a response" refers to the reviewer requiring a response from the author to address the bot comment, not from the bot itself. The previous statement "reviewers must indicate which comments they want addressed" also suggests that the reviewer has taken on the 'onus' of the bot comment. In this scenario I don't find it unfair to the author to require the PR author to say "the bot's wrong here" to dismiss the comment; in fact, having that 2nd step "reviewers must indicate which comments they want addressed" means the PR author is in fact rejecting the combined analysis of the bot and the reviewer, so I'd say this is more biased against reviewers.

The current wording is a bit ambiguous and could conceivably be interpreted to mean that "it's okay to require a response" implicitly. I would like to see this clarified to say explicitly that a bot's comment only needs to be responded to if a reviewer explicitly indicates that.
  - In other words, reviewers must explicitly endorse an LLM comment before blocking a PR. They are responsible for their own analysis of the LLM's comment and cannot treat it as a CI failure.
- ℹ️ This does not apply to private use of an LLM for reviews; see ✅ above.

All of these **must** disclose that an LLM was used.

## Appendix

### No witch hunts
| ["The optimal amount of fraud is not zero"](https://www.bitsaboutmoney.com/archive/optimal-amount-of-fraud/). | ||||||
| Do not try to be the police for whether someone has used an LLM. | ||||||
| If it's clear they've broken the rules, point them to this policy; if it's borderline, report it to the mods and move on. | ||||||
|
jyn514 marked this conversation as resolved.
|
||||||
|
|
||||||
Conversely, lying about whether you've used an LLM is an instant [code of conduct](https://rust-lang.org/policies/code-of-conduct/) violation.
If you are not sure where you fall in this policy, please talk to us.
Don't try to hide it.

### Responsibility

All contributions are your responsibility; you cannot place any blame on an LLM.
Suggested change
Clarity / wording.
- ℹ️ This includes when asking people to address review comments originally authored by an LLM. See "review bots" under ⚠️ above.

### "originally authored"
This document uses the phrase "originally authored" to mean "text that was generated by an LLM (and then possibly edited by a human)".
I'm not comfortable with the definition of "originally authored" as written here. Authorship is something that applies to a person, not tools; an LLM can generate text, but it isn't an author.

No amount of editing can change authorship; authorship sets the initial style and it is very hard to change once it's set.
Member
Suggested change
Taking a different approach here, of narrowing the focus to the phrasing in this policy, rather than trying to get people to agree with the fully general statement.
For more background about analogous reasoning, see ["What Colour are your bits?"](https://ansuz.sooke.bc.ca/entry/23)

### Non-exhaustive policy

This policy does not aim to be exhaustive.
If you have a use of LLMs in mind that isn't on this list, judge it in the spirit of this overview:
- Usages that do not use LLMs for creation and do not show LLM output to another human are likely allowed ✅
- Usages that use LLMs for creation or show LLM output to another human are likely banned ❌
This policy is not set in stone.
We can evolve it as we gain more experience working with LLMs.
Contributor
I would feel better if we made this policy explicitly time-limited or tied to a process of gathering more information.

Niko, you're one of the loudest voices trying to dictate the direction we're going. I would argue that a majority of the pushback against sensible policies like this one has come from you; since you're effectively the project manager for the project, your voice carries further than a dozen people's, and it feels like you're genuinely oblivious to this. Plus, a lot of the arguments you've offered have been from the position that whatever you think is reasonable is canonically reasonable, which is a perspective that resists all forms of negotiation. We all agree that this policy is not going to be permanent, but a large portion of the project seems to be in agreement that this should be the policy we adopt until a project-wide policy is adopted. It's also worth noting, since it's been brought up multiple times, that we don't do policy by majority vote. This is even true for a policy like this one: if we did majority vote, we'd just ban all LLM usage, but we're not doing that because we're willing to compromise. Right now, it seems pretty unsubstantiated that a handful of voices have dictated this position. While it's true that a small number of people have been active in the policy channel, a majority of the project have pointed out their desire for a total ban on LLM usage. This, being noticeably more lenient than that, is a compromise from us. You should consider whether you're willing to compromise at all on your stance, and what compromise would mean for you. As I mentioned in one of the discussions, I do think it's a false equivalence that both sides need to concede something, but if you don't even know what it means to compromise, then negotiation is utterly impossible. I really am not convinced that you understand what a compromise of the pro-LLM position would be, based upon the utter confusion you've expressed when mentioning that some of the contributions you've done would not be acceptable under some of the proposed policies.

I do not plan to actually engage in this conversation any further (I acknowledge my biases and when to step out), but I think it's worth pointing out to the at-least-5 people who gave a thumbs-down reaction to my comment that I personally have a rule when it comes to this. If I ever decide to mark my dissent on a comment with the thumbs-down emoji, I always reply explaining why unless everything I wish to say has already been said. Many times, the result is far more critical to the poster than a simple emoji, but I do this because I genuinely want people to understand why I feel a particular way, rather than just saying "I don't like this and will not explain why." We don't improve if we don't know what's wrong. My above comment, in my mind, is required in order to give Niko's comment a thumbs-down reaction, because otherwise I'm being insincere to him and everyone else reading. I do not say that I disagree with something without saying why; in that case, it's better to not say anything at all.
Again, I acknowledge that my explanation can be deeply hurtful. Disagreement is a painful but necessary process. I also know that there are plenty of times where I have been excessively hurtful without providing the relevant constructive feedback, and think it's worth calling me out for that. I don't apply my standards to anyone else. Lots of people just don't have time to write up a full response. But I personally, in these cases, simply don't respond at all. So, consider whether your simple thumbs-down emoji constitutes genuinely useful feedback, or whether you're just being excessively hurtful instead. And, if you would like to express your dissent in private, I'm open to DMs on Zulip too; this is an open invitation to just say what you feel without a filter. It would be hypocritical of me to be so blunt with my opinions and not accept the same in kind.
Member
I'll reply here since this is 'the thread', but I want to say first that I don't agree with much of what you wrote @clarfonthey. I believe Niko is raising a concern in good faith, though I'd like to understand it better.

@nikomatsakis, can you elaborate on specifically what information you think we should be seeking, and what process you're imagining for iterating towards a better policy? Are you seeking commitment from folks who have engaged so far to keep engaging in discussion on Zulip? Something else? I'd be happy to chat offline (Zulip, or a more synchronous meeting if you'd prefer) if that makes more sense. I see @jyn514 left a comment below with some more data on project opinions, but it's not clear to me if that's the kind of data you're seeking, or something else. Could you elaborate on what you're looking for and what kinds of process/timeline you would find better than the copious discussion and iteration that has landed us on this (and some other) proposals? I personally think a policy like this one, which is relatively restrictive but scope-limited and leaves room for usage in other areas of the Project, gives us a good balance of continued input on where the world is while leaving the door open for private usage for those comfortable with doing so. That combination seems guaranteed to ensure we're not going to stop discussing, since everyone seems to want something different from this policy, even if we manage to get to consensus on landing this in the meantime.
Contributor
I appreciate the vote of confidence, Mark. And @clarfonthey I appreciate that I have reputational clout in the project -- though I'd also note that it doesn't usually translate into me getting my way without fighting for it tooth and nail. =) In any case, I wouldn't be speaking up this much if I didn't feel it was important. To answer your questions, Mark:

No, I think the Zulip discussions are not useful. I want to see a more structured process. I think it would look like this:

For example, @jyn514 has expressed openness to having a separate review queue for "LLM-authored content". How many others on the compiler team share that opinion? I have no idea. And of course @clarfonthey has expressed ethical concerns, and I don't really know how many people share that bright red line. And that's just existing maintainers; what about people who've opened PRs in the last year? How many of them work with LLMs at work or on a daily basis? What are their experiences like? Another thing I'm very curious to understand, something I think could be useful, is -- what are people afraid of or hopeful for as a result of this policy? That might inform the conversation. For example, for me, one of my big fears is that we will be distancing ourselves from future contributors, many of whom will be coding with LLMs. When Rust started, we made a deliberate choice to use Github and not Bugzilla because, frankly, Github is where the people are. I would be interested to see if the perspectives around LLM usage vary between existing maintainers and future contributors or along other lines.

According to https://rethinkpriorities.org/research-area/adoption-llms-tech-workers/ 91% of respondents have used LLMs for work, with 29% using them daily. This data is a year old now, and I have many reasons to believe usage of tools like Claude Code has only increased since then, and dramatically at that. Of the four tech companies I have direct knowledge of, all four have gone from AI being used for coding by a minority of developers, to being used by almost every developer for coding in that time. In two of them using AI coding tools is practically mandatory. It's also quite clear to everyone that companies like Anthropic are struggling to keep up with the growth in usage. I personally have approached AI with extreme skepticism from the beginning, and I still consider its functionality to be dramatically oversold by the companies selling it, but it's extremely widely used already, is already a very effective tool when used correctly, and I think @nikomatsakis is absolutely correct in thinking this will distance contributors who would use it, which is now essentially all new programmers.

At the company I work for, it's also the case that most people use AI quite a bit; however, multiple new employees have expressed that they don't want to use it much or at all because it could interfere with learning. They want to go from junior engineers to senior engineers, and the best way we know to do that is hands-on experience. So I agree that it's become an industry standard (and policies which do not reflect this may be unsustainable); however, it's not necessarily true that all new programmers will be AI users initially.
Member
I want to second @nikomatsakis's point, mostly: I'm not sure that I necessarily care that this is "time limited", but as restrictive as this is, I don't want us to merge this and think it is "enough". I also don't know what "correct" here looks like. Let me try to spell out exactly what I would and would not like:

In all, I think it's best said: I don't want us to think of this policy as "done". I want it to be another stepping stone in figuring out what works. I don't think "only talking" gets us very far (which is why some policy, even if more restrictive or less restrictive than some would like, is still a good step), but I don't think that this is a "solution", only another means to help us figure out what works for the Project. I don't want us to merge this and then, any time we are discussing, someone can just point and say: "look, we merged a policy, why are we still discussing this?" Unfortunately, we're bad at ensuring we don't set something down and forget to pick it up again. A time-limited or event-limited policy can help with this. If we said "this policy is only in effect for a year", then in a year we must reevaluate whether this policy "worked" and what changes (if any) should be made. I'm not sure what an "event-limited" process would look like, but I could imagine it's some combination of doing a survey, identifying key "events" like e.g. a capable/free "open model" being available, additional tooling being built that could obviate the need for some of this policy, the Project gaining consensus on a Project-wide policy, some team raising a concern, etc. I imagine what we actually want is some combination. Just taking a stab:
Member
Author
👍 I like the idea of having an escape hatch if there's a crisis.

I think this is implied, but 👍 to spelling it out explicitly.

I don't like that this leaves no room for a project-wide policy that allows teams to set more specific policies.

If there's a sunset clause, what's the fallback policy? Ideally it's a policy that everyone dislikes, so there's incentive to properly fix it. The current status quo seems to be... fully permissive but also people will get mad at you if you submit LLM-generated work? That seems less than ideal.
Member
My second point includes not just time passing, but also the Project-wide policy (which is, I guess, the "fallback"). I don't necessarily think everyone has to dislike that, but rather that it needs to be something more fundamentally shared across the entire project than a rust-lang/rust-specific policy. The other two points are an active dissolution that fundamentally requires either consensus (same as forming the policy), or evidence of active harm.