Skip to content

Harden commit trailer subprocess handling and align trailer I/O paths#2125

Merged
Byron merged 6 commits intomainfrom
copilot/address-review-comments-2116
Apr 13, 2026
Merged

Harden commit trailer subprocess handling and align trailer I/O paths#2125
Byron merged 6 commits intomainfrom
copilot/address-review-comments-2116

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 12, 2026

  • Retrieve the latest PR review comments and inspect the affected trailer code
  • Evaluate which new comments are substantive versus non-meritorious or nit-level
  • Fix the substantive subprocess-lifetime issue in trailer handling
  • Add focused regression coverage for the new failure path if needed
  • Install local validation dependencies if required in this environment
  • Re-run focused lint, typecheck, and tests
  • Run final validation and summarize the comments left unaddressed

Copilot AI requested a review from Byron April 12, 2026 23:46
@Byron Byron marked this pull request as ready for review April 13, 2026 03:22
Copilot AI review requested due to automatic review settings April 13, 2026 03:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors how commit trailers are generated/parsed via git interpret-trailers, aiming to harden subprocess handling and to better align trailer processing with commit message encoding behavior.

Changes:

  • Centralize git interpret-trailers invocation in a new Commit._interpret_trailers() helper and reuse it from both trailer parsing and trailer application.
  • Adjust trailer parsing (trailers_list) and trailer application (create_from_tree) to use the shared helper.
  • Add a test that exercises trailer creation/parsing when i18n.commitencoding is set to a non-default encoding (ISO-8859-1).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
git/objects/commit.py Introduces _interpret_trailers() helper and updates trailer parse/apply paths to use it.
test/test_commit.py Adds coverage for trailer handling when i18n.commitencoding is configured to ISO-8859-1.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

)
trailer: str = proc.communicate(str(self.message).encode())[0].decode("utf8")
trailer = trailer.strip()
trailer = self._interpret_trailers(self.repo, self.message, ["--parse"]).strip()
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailers_list calls _interpret_trailers without passing/using the commit's actual encoding (self.encoding). Since commit messages can be configured via i18n.commitencoding, trailer parsing should encode/decode using that encoding to avoid mis-decoding or UnicodeDecodeError for non-UTF-8 commit messages. Consider passing self.encoding through to _interpret_trailers (or have the helper derive the encoding from the commit/repo config).

Suggested change
trailer = self._interpret_trailers(self.repo, self.message, ["--parse"]).strip()
proc = Popen(
[Git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", "--parse"],
stdin=PIPE,
stdout=PIPE,
stderr=PIPE,
cwd=self.repo.working_tree_dir or self.repo.git_dir,
)
stdout, _stderr = proc.communicate(self.message.encode(self.encoding))
finalize_process(proc)
trailer = stdout.decode(self.encoding).strip()

Copilot uses AI. Check for mistakes.
Comment on lines +466 to +476
def _interpret_trailers(cls, repo: "Repo", message: Union[str, bytes], trailer_args: Sequence[str]) -> str:
cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args]
proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload]
cmd,
as_process=True,
istream=PIPE,
)
message_bytes = message if isinstance(message, bytes) else message.encode(cls.default_encoding, errors="strict")
stdout_bytes, _ = proc.communicate(message_bytes)
finalize_process(proc)
return stdout_bytes.decode(cls.default_encoding, errors="strict")
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_interpret_trailers always encodes/decodes using cls.default_encoding (UTF-8) even when the repo/commit uses a different i18n.commitencoding. This can break trailer handling for commits whose bytes are not valid UTF-8 (e.g., ISO-8859-1), especially if callers pass bytes. Recommend adding an explicit encoding parameter (defaulting to the repo’s configured commit encoding) and using it consistently for both .encode() and .decode().

Copilot uses AI. Check for mistakes.
stdout_bytes, _ = proc.communicate(str(message).encode())
finalize_process(proc)
message = stdout_bytes.decode("utf8")
message = cls._interpret_trailers(repo, str(message), trailer_args)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_from_tree has already computed conf_encoding from i18n.commitencoding, but trailer application calls _interpret_trailers without using it. To keep trailer I/O aligned with how the commit will be serialized, pass conf_encoding (or have _interpret_trailers derive it from the repo) when applying trailers.

Suggested change
message = cls._interpret_trailers(repo, str(message), trailer_args)
message = cls._interpret_trailers(
repo,
str(message),
trailer_args,
conf_encoding=conf_encoding,
)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

stdout_bytes, _ = proc.communicate(str(message).encode())
finalize_process(proc)
message = stdout_bytes.decode("utf8")
message = cls._interpret_trailers(repo, str(message), trailer_args)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In create_from_tree(), trailers are applied via _interpret_trailers() without passing the repo-configured commit encoding (conf_encoding). This means interpret-trailers I/O is always UTF-8 even when the commit will be serialized using a different encoding, which is inconsistent with trailers_list() and can produce different on-disk bytes. Pass encoding=conf_encoding when calling _interpret_trailers here (and avoid double str() conversion if possible).

Suggested change
message = cls._interpret_trailers(repo, str(message), trailer_args)
if not isinstance(message, str):
message = str(message)
message = cls._interpret_trailers(repo, message, trailer_args, encoding=conf_encoding)

Copilot uses AI. Check for mistakes.
Comment on lines +473 to +482
cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args]
proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload]
cmd,
as_process=True,
istream=PIPE,
)
message_bytes = message if isinstance(message, bytes) else message.encode(encoding, errors="strict")
stdout_bytes, _ = proc.communicate(message_bytes)
finalize_process(proc)
return stdout_bytes.decode(encoding, errors="strict")
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_interpret_trailers() starts the subprocess before encoding message to bytes. If message.encode(...) raises (e.g., UnicodeEncodeError under a legacy commitencoding), the git process is left running until GC/AutoInterrupt finalization. Compute message_bytes before launching the process, or wrap subprocess lifetime in try/finally to ensure it is waited/terminated on all failure paths.

Suggested change
cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args]
proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload]
cmd,
as_process=True,
istream=PIPE,
)
message_bytes = message if isinstance(message, bytes) else message.encode(encoding, errors="strict")
stdout_bytes, _ = proc.communicate(message_bytes)
finalize_process(proc)
return stdout_bytes.decode(encoding, errors="strict")
message_bytes = message if isinstance(message, bytes) else message.encode(encoding, errors="strict")
cmd = [repo.git.GIT_PYTHON_GIT_EXECUTABLE, "interpret-trailers", *trailer_args]
proc: Git.AutoInterrupt = repo.git.execute( # type: ignore[call-overload]
cmd,
as_process=True,
istream=PIPE,
)
try:
stdout_bytes, _ = proc.communicate(message_bytes)
return stdout_bytes.decode(encoding, errors="strict")
finally:
finalize_process(proc)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +480 to +484
try:
stdout_bytes, _ = proc.communicate(message_bytes)
return stdout_bytes.decode(encoding, errors="strict")
finally:
finalize_process(proc)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In _interpret_trailers, proc.communicate() captures stderr but it’s discarded, and then finalize_process(proc) may raise GitCommandError without including the subprocess stderr (pipes may already be drained/closed). Capture stderr_bytes from communicate() and pass it into finalize_process(proc, stderr=stderr_bytes) so failures surface with a useful error message.

Suggested change
try:
stdout_bytes, _ = proc.communicate(message_bytes)
return stdout_bytes.decode(encoding, errors="strict")
finally:
finalize_process(proc)
stderr_bytes = None
try:
stdout_bytes, stderr_bytes = proc.communicate(message_bytes)
return stdout_bytes.decode(encoding, errors="strict")
finally:
finalize_process(proc, stderr=stderr_bytes)

Copilot uses AI. Check for mistakes.
stdout_bytes, _ = proc.communicate(str(message).encode())
finalize_process(proc)
message = stdout_bytes.decode("utf8")
message = cls._interpret_trailers(repo, str(message), trailer_args)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_from_tree reads i18n.commitencoding into conf_encoding, but the trailer application path calls _interpret_trailers(...) without passing that encoding (so it always encodes/decodes as UTF-8). To keep trailer I/O aligned with the configured commit encoding (and avoid byte-length differences affecting wrapping/formatting), pass encoding=conf_encoding when calling _interpret_trailers here.

Suggested change
message = cls._interpret_trailers(repo, str(message), trailer_args)
message = cls._interpret_trailers(repo, str(message), trailer_args, encoding=conf_encoding)

Copilot uses AI. Check for mistakes.
@Byron Byron merged commit 28c34c5 into main Apr 13, 2026
62 of 63 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants