Skip to content

postprocess: support multiple transforms#1543

Open
thedataking wants to merge 5 commits intomasterfrom
perl/postprocess-multiple-transforms
Open

postprocess: support multiple transforms#1543
thedataking wants to merge 5 commits intomasterfrom
perl/postprocess-multiple-transforms

Conversation

@thedataking
Copy link
Copy Markdown
Contributor

Support multiple transforms such that we can support different post-processing steps such as:

  • comment transfer
  • assert folding
  • relooper cleanup
  • reformatting of large tables, etc.

Some transforms might also be broken down into a sequence of simpler transforms. A concrete use case is lua which contains license headers and other matter before function definitions which we do not want to pass so the comment transfer transform.

(Replaces PR #1512 as it was automatically closed when I renamed the branch)

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the c2rust postprocessing system to support multiple transforms in sequence. The main goal is to enable different post-processing steps (like comment transfer, assert folding, relooper cleanup, etc.) to be applied independently and in configurable order.

Changes:

  • Introduced an abstract base class AbstractTransform to standardize transform implementations
  • Renamed CommentTransfer to CommentsTransform and moved it to a new module structure
  • Added factory function get_transform_by_id to instantiate transforms by name
  • Updated CLI to accept comma-separated list of transforms via --transforms parameter

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
c2rust-postprocess/postprocess/transforms/base.py New abstract base class defining transform interface with apply() and apply_dir() methods
c2rust-postprocess/postprocess/transforms/comments.py Refactored comments transform with renamed classes, added validation, and AbstractTransform inheritance
c2rust-postprocess/postprocess/transforms/init.py New factory function to instantiate transforms by ID
c2rust-postprocess/postprocess/comments/init.py Duplicate/incomplete implementation with critical bugs that appears to be leftover from refactoring
c2rust-postprocess/postprocess/init.py Updated main entry point to support multiple transforms and use factory pattern
c2rust-postprocess/tests/llm-cache/CommentsTransform/*/metadata.toml Updated cache metadata to reflect "CommentsTransform" naming
c2rust-postprocess/tests/llm-cache/CommentsTransform/*/response.txt New cached test responses for quickSort and partition functions
Comments suppressed due to low confidence (3)

c2rust-postprocess/postprocess/transforms/comments.py:69

  • The docstring parameter documentation is missing the colon prefix for reST/Sphinx format. It should be :param rust_fn: and :return: instead of param: and return:.
    c2rust-postprocess/postprocess/transforms/comments.py:242
  • The transfer_comments_dir method calls self.apply with a rust_source_file parameter, but the apply method signature only accepts root_rust_source_file. This mismatch will cause a TypeError at runtime. Either the apply method needs to be updated to accept rust_source_file, or this call needs to be updated to match the apply signature.
    c2rust-postprocess/postprocess/transforms/comments.py:242
  • Keyword argument 'rust_source_file' is not a supported parameter name of method CommentsTransform.apply.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/transforms/base.py Outdated
Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/transforms/base.py
Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/comments/__init__.py Outdated
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch from 828f00c to 79adc8c Compare January 10, 2026 03:35
@immunant immunant deleted a comment from Copilot AI Jan 10, 2026
@immunant immunant deleted a comment from Copilot AI Jan 10, 2026
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch 3 times, most recently from 3c35f8a to ade83ec Compare January 10, 2026 04:05
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch 2 times, most recently from d21f1eb to 111e4d5 Compare February 1, 2026 09:13
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch 3 times, most recently from 636dac3 to 34f08b2 Compare February 10, 2026 10:15
@thedataking thedataking marked this pull request as ready for review February 10, 2026 10:15
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch 3 times, most recently from 70e6a92 to 0f869ac Compare February 11, 2026 09:44
Copy link
Copy Markdown
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this PR does a lot of different things, which makes it harder to review altogether at once. Could we split this into a few separate PRs?

  • 8a3e399 seems totally independent. We can land that on its own without any issue.
  • 9595e66 and dde9bea are about supporting multiple transforms, which is what I think this PR was originally titled for.
  • 73069ee can be on its own, too, and there are parts of it that also overlap with the diffs in #1553 that I think we can better untangle if it's its own PR.
  • 00cb8ec and 0f869ac seem like their own transforms that can be their own PRs.

Also, I don't think the LLM cache is being fully updated on each commit, which makes it much harder to review, as then I can't fully see what that commit changes, as the cache is basically the postprocessor's snapshot tests.

Comment thread c2rust-postprocess/postprocess/transforms/__init__.py Outdated
Comment thread c2rust-postprocess/postprocess/__init__.py
Comment thread c2rust-postprocess/postprocess/transforms/base.py Outdated
Comment thread c2rust-postprocess/postprocess/transforms/base.py Outdated
Comment thread c2rust-postprocess/postprocess/transforms/comments.py
Comment on lines -45 to -49
/*
* Lomuto Partition Scheme:
* Partitions the array so that elements < pivot are on the left,
* and elements >= pivot are on the right.
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing this valid doc comment for partition?

Comment on lines -4 to -13
response = """/// Lomuto Partition Scheme:
/// Partitions the array so that elements < pivot are on the left,
/// and elements >= pivot are on the right.
response = """/// Partition the subarray around the last element as pivot and return pivot's final index.
#[no_mangle]
pub unsafe extern "C" fn partition(
mut arr: *mut ::core::ffi::c_int,
mut low: ::core::ffi::c_int,
mut high: ::core::ffi::c_int,
) -> ::core::ffi::c_int {
// Partition the subarray around the last element as pivot and return pivot's final index.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a worse translation. That comment is supposed to be inside the function, and now we're missing the comment on/outside the function.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to delete the previous version of this file (which is what #1554 makes easy to do) so that git will show this as a moved file with a diff. This makes it much easier to review the changes.


class AssertsTransform(AbstractTransform):
def __init__(self, cache: AbstractCache, model: AbstractGenerativeModel):
super().__init__(SYSTEM_INSTRUCTION)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making SYSTEM_INSTRUCTION an abstract method might be simpler and more idiomatic.

@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch from 0f869ac to bcf6795 Compare February 11, 2026 10:43
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch from bcf6795 to 836b725 Compare February 13, 2026 09:02
@thedataking thedataking changed the base branch from master to perl/cache-scope-arg February 13, 2026 09:09
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch from 836b725 to ec7002e Compare February 13, 2026 09:38
Base automatically changed from perl/cache-scope-arg to master February 13, 2026 09:48
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch 3 times, most recently from f07c3ff to 296bea4 Compare February 13, 2026 10:37
A later change adds an abstract base class using the `Transform` suffix
so it is more consistent with naming elsewhere in this project to call
the inaugural transform `CommentsTransform`. Updated cache dir
correspondingly.
Adds base class for transforms.
The C declarations can contain preprocessor definitions with their own
associated comments. Sometimes we also get license headers and similar
front matter. Use a separate transformation to remove this front matter
such that validation of comment insertion only take the relevant comments
into account.

Note: the trim step is currently tightly integrated into the comment
transfer step but I expect we'll want to run it before other transforms
as well. This can be handled in a follow-up change.
TODO: Read assert.h from /usr/include to better handle differences between
systems.
@thedataking thedataking force-pushed the perl/postprocess-multiple-transforms branch from 296bea4 to 46666a3 Compare April 5, 2026 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants