doc: create ai-guidelines and include to CONTRIBUTING#62105
Conversation
|
Review requested:
|
|
There may be some ideas we can borrow from https://llvm.org/docs/AIToolPolicy.html - for example "good first issue" should not be picked up by AI is a good one. |
I took inspiration from https://github.com/zulip/zulip/blob/main/CONTRIBUTING.md#ai-use-policy-and-guidelines |
| * **Verify accuracy** of any LLM-generated content before including it in a | ||
| PR description or comment. | ||
| * **Complete pull request templates fully** rather than replacing them with | ||
| LLM-generated summaries. |
There was a problem hiding this comment.
Do we have a template? I thought those are for issues, not PRs.
There was a problem hiding this comment.
Not strictly a template: https://github.com/nodejs/node/blob/main/.github/PULL_REQUEST_TEMPLATE.md?plain=1
There was a problem hiding this comment.
It's not possible to fulfil the instructions "Complete pull request templates fully" based on the contents of https://github.com/nodejs/node/blob/main/.github/PULL_REQUEST_TEMPLATE.md?plain=1 so it looks like this sentence needs to be removed.
There was a problem hiding this comment.
I'd be against this contribution policy update. While many different opinions exist on the licensing terms of the code produced by LLMs, my opinion is that the generated code isn't explicitly licensed and attributed to the original authors so it cannot be considered open source regardless of the used prompt.
| * **Verify accuracy** of any LLM-generated content before including it in a | ||
| PR description or comment. | ||
| * **Complete pull request templates fully** rather than replacing them with | ||
| LLM-generated summaries. |
There was a problem hiding this comment.
| * **Verify accuracy** of any LLM-generated content before including it in a | |
| PR description or comment. | |
| * **Complete pull request templates fully** rather than replacing them with | |
| LLM-generated summaries. | |
| * **Verify accuracy** of any LLM-generated content before including it in a | |
| PR description or comment. |
|
In #61478 (comment) , regarding the usage of Claude Code, @mcollina suggested:
I added it to the TSC agenda tomorrow for awareness/context collection before moving to a proper vote. @indutny sorry about the short notice since this is just one day ahead of the meeting, but if you'd like to join the meeting to present your points please let us know. |
mcollina
left a comment
There was a problem hiding this comment.
lgtm with a sentence removed
|
@joyeecheung thanks for considering me for this! I'd be happy to join if the time isn't in conflict with my work meetings tomorrow. Could you send me an invite, please? |
|
nodejs/TSC#1830 this is the issue. It might be reschedueld to the next meeting if there is low attedance given the timezone. I think we can schedule the discussion for April 1st which will be in the morning PT so a lot of people can join, and I would table the vote for that session. I would try to get an answer from the Board by then. |
joyeecheung
left a comment
There was a problem hiding this comment.
LGTM % a suggestion to spell out exactly how the "approval of automated tooling" works
| PR description, but not in the commit message, unless the context would not | ||
| have made sense without mentioning the specific brand/trademark. The goal | ||
| is to prevent the commit messages, which are part of the code base, from being | ||
| abused for for-profit marketing. |
There was a problem hiding this comment.
Still generally disagree that this recommendation is at all necessary.
jasnell
left a comment
There was a problem hiding this comment.
Good enough for me. There are a few details I don't think are strictly necessary but given that they are recommendations and not hard requirements, I can live with it.
|
Discussed at the last @nodejs/tsc. |
84a3b2d to
b696334
Compare
Co-authored-by: Beth Griggs <bethanyngriggs@gmail.com> Co-authored-by: Aditi <62544124+Aditi-1400@users.noreply.github.com> Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com> Co-authored-by: Tobias Nießen <tniessen@tnie.de> Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com> Co-authored-by: Mike McCready <66998419+MikeMcC399@users.noreply.github.com> Co-authored-by: Efe <dogukankrskl@gmail.com> Co-authored-by: James M Snell <jasnell@gmail.com> Co-authored-by: Trivikram Kamat <16024985+trivikr@users.noreply.github.com> Signed-off-by: RafaelGSS <rafael.nunu@hotmail.com>
b696334 to
2c97efd
Compare
|
I have squashed the commits and solved the linting issue. Nothing has changed. PTAL. |
Per Node.js guidelines, after 2 weeks or more without a response, we consider the objection dismissed.
joyeecheung
left a comment
There was a problem hiding this comment.
Some wording tightening and adding references..
Also I believe we should reference the large PR policy added by #62829 in this document, though not exactly sure where?
| Contributors may use AI tools to assist with contributions, but such tools | ||
| never replace human judgment. | ||
|
|
||
| When using AI as a coding assistant: |
There was a problem hiding this comment.
| Contributors may use AI tools to assist with contributions, but such tools | |
| never replace human judgment. | |
| When using AI as a coding assistant: | |
| Tools should never replace human judgment, regardless of whether they are powered by AI. |
| * **Understand the codebase first.** Do not skip familiarizing yourself with | ||
| the relevant subsystem. LLMs frequently produce inaccurate descriptions of | ||
| Node.js internals — always verify against the actual source. When using an AI | ||
| tool, ask it to cite the exact source it’s relying on, and then | ||
| match the claim against that resource to verify if it holds up in the current | ||
| code. |
There was a problem hiding this comment.
| * **Understand the codebase first.** Do not skip familiarizing yourself with | |
| the relevant subsystem. LLMs frequently produce inaccurate descriptions of | |
| Node.js internals — always verify against the actual source. When using an AI | |
| tool, ask it to cite the exact source it’s relying on, and then | |
| match the claim against that resource to verify if it holds up in the current | |
| code. | |
| * **Understand the codebase first.** Do not skip familiarizing yourself with | |
| the relevant subsystem. Always verify analysis generated by tools against | |
| the actual source code. |
| * **Own every line you submit.** You are responsible for all code in your | ||
| pull request, regardless of how it was generated. This includes ensuring | ||
| that AI-generated or AI-assisted contributions satisfy the project's | ||
| [Developer's Certificate of Origin][] and licensing requirements. Be | ||
| prepared to explain any change in detail during review. |
There was a problem hiding this comment.
| * **Own every line you submit.** You are responsible for all code in your | |
| pull request, regardless of how it was generated. This includes ensuring | |
| that AI-generated or AI-assisted contributions satisfy the project's | |
| [Developer's Certificate of Origin][] and licensing requirements. Be | |
| prepared to explain any change in detail during review. | |
| * **Own every line you submit.** You are responsible for all code in your | |
| pull request, regardless of how it was created. The submitted changes | |
| must satisfy the project's [Developer's Certificate of Origin][] and licensing | |
| requirements. Be prepared to explain any change in detail during review. |
| * **Keep logical commits.** Structure commits coherently even when an LLM | ||
| generates multiple changes at once. Follow the existing | ||
| [commit message guidelines][]. |
There was a problem hiding this comment.
| * **Keep logical commits.** Structure commits coherently even when an LLM | |
| generates multiple changes at once. Follow the existing | |
| [commit message guidelines][]. | |
| * **Keep the commits logical.** The [commit message guidelines][] | |
| and [commit squashing guidelines](./pull-requests.md#commit-squashing) | |
| must be followed regardless of what tool is used in the pull request. |
| * **Test thoroughly.** AI-generated code must pass the full test suite and | ||
| any manually written tests relevant to the change. Existing tests should not | ||
| be removed or modified without human verification. Do not rely on the LLM | ||
| to assess correctness. It is crucial to manually verify the correctness of | ||
| tests against the expected behavior of the feature being tested, | ||
| independently of the feature's implementation. |
There was a problem hiding this comment.
| * **Test thoroughly.** AI-generated code must pass the full test suite and | |
| any manually written tests relevant to the change. Existing tests should not | |
| be removed or modified without human verification. Do not rely on the LLM | |
| to assess correctness. It is crucial to manually verify the correctness of | |
| tests against the expected behavior of the feature being tested, | |
| independently of the feature's implementation. | |
| * **Test thoroughly.** Existing tests should not be removed or modified | |
| without human verification. It is crucial to verify, with human judgement, | |
| the correctness of tests against the expected behavior of the feature being | |
| tested, independently of the feature's implementation. |
| * **Edit generated comments critically.** LLM-generated comments are often | ||
| verbose or inaccurate. Remove comments that simply restate what the code | ||
| does; add comments only where the logic is non-obvious. |
There was a problem hiding this comment.
| * **Edit generated comments critically.** LLM-generated comments are often | |
| verbose or inaccurate. Remove comments that simply restate what the code | |
| does; add comments only where the logic is non-obvious. | |
| * **Keep the comments useful.** Verify with human judgement that the | |
| comments are necessary and accurate. Remove comments that simply | |
| restate what the code does. Add comments only where the logic is non-obvious. |
| be done in the form of a GitHub workflow, submit a pull request to add the workflow | ||
| and use the usual pull request review process to seek consensus. | ||
|
|
||
| ## Using AI for code contributions |
There was a problem hiding this comment.
| ## Using AI for code contributions | |
| ## When AI is used in contributions |
| verbose or inaccurate. Remove comments that simply restate what the code | ||
| does; add comments only where the logic is non-obvious. | ||
|
|
||
| ## Using AI for communication |
There was a problem hiding this comment.
| ## Using AI for communication | |
| ## When AI is used for communications |
| * **Do not post messages generated entirely by AI** in pull requests, issues, or the | ||
| project's communication channels. | ||
| * **Verify accuracy** of any LLM-generated content before including it in a | ||
| PR description or comment. | ||
| * **Link to primary sources** — code, documentation, specifications — rather | ||
| than quoting LLM answers or linking to LLM chats. |
There was a problem hiding this comment.
| * **Do not post messages generated entirely by AI** in pull requests, issues, or the | |
| project's communication channels. | |
| * **Verify accuracy** of any LLM-generated content before including it in a | |
| PR description or comment. | |
| * **Link to primary sources** — code, documentation, specifications — rather | |
| than quoting LLM answers or linking to LLM chats. | |
| * **Do not post messages generated entirely by AI** in pull requests, issues, or the | |
| project's communication channels. Such communication may be removed in accordance | |
| to [the Node.js moderation policy](https://github.com/nodejs/admin/blob/main/Moderation-Policy.md). |
I think we should hash out the "what can be in communications" in the moderation policy, not here..
| If the disclosure involves trademarks or commercial brands, it's recommended to | ||
| anonymize it in the commit message or only mention the brand/trademark in the | ||
| PR description, but not in the commit message, unless the context would not | ||
| have made sense without mentioning the specific brand/trademark. The goal | ||
| is to prevent the commit messages, which are part of the codebase, from being | ||
| abused for profit-driven marketing. |
There was a problem hiding this comment.
| If the disclosure involves trademarks or commercial brands, it's recommended to | |
| anonymize it in the commit message or only mention the brand/trademark in the | |
| PR description, but not in the commit message, unless the context would not | |
| have made sense without mentioning the specific brand/trademark. The goal | |
| is to prevent the commit messages, which are part of the codebase, from being | |
| abused for profit-driven marketing. | |
| Be aware that the mention of for-profit trademarks or commercial brands in commit | |
| messages, which are part of the code base, can be abused for profit-driven marketing. | |
| If the disclosure involves for-profit trademarks or commercial brands, it's | |
| recommended to either anonymize the branding (e.g. say `a frontier reasoning model`, | |
| `a closed-source coding agent` instead of `<brand>`), or only mention the | |
| for-profit brand/trademark in the PR description, but not in the commit message, | |
| unless the message would not have made sense without mentioning the specific | |
| brand/trademark. These recommendations only apply to for-profit tools/models, not | |
| any non-profit ones. |
(I don't know how far we are from this, but it's seems totally possible that there can be public, non-profit model providers in the future. That and non-profit open source harness already exists, in those cases these recommendations wouldn't make as much sense).
As discussed in today's TSC meeting.
cc: @nodejs/tsc @BridgeAR