Fix PDF spec compliance, git line-ending corruption, and add xref validation test #691

Merged
Shane32 merged 9 commits into master from copilot/fix-pdf-creation-issue
Feb 27, 2026

Conversation

Contributor

Copilot AI commented Feb 21, 2026

The PDF renderer produced invalid files in two ways. First, the /Kids array embedded an inline page dictionary, which violates the PDF spec's requirement that /Kids hold indirect references. Second, writer.WriteLine() emitted platform-specific line endings, which left xref byte offsets wrong once git normalized line endings on checkout.

PDF structure fix

/Kids must contain only indirect object references. Restructured from 3 objects to 4:

  • Object 2 (Pages): /Kids [ 3 0 R ] — indirect reference instead of inline dict
  • Object 3 (Page): new indirect Page object with /Contents 4 0 R
  • Object 4 (Content stream): renumbered from 3→4; length now uses Encoding.ASCII.GetByteCount(content)
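
The switch from content.Length to a byte count matters because a stream's /Length is measured in bytes, while string.Length counts UTF-16 code units. A minimal sketch of the difference follows; the two content strings are hypothetical stand-ins, not the renderer's real stream, and the code assumes a C# 9+ top-level program. For pure ASCII content, which is presumably what the path operators are, both counts agree, so the change reads as a defensive one.

```csharp
using System;
using System.Text;

// Hypothetical content strings; the renderer's real stream holds path operators.
string ascii = "q\r\n1 0 0 1 0 0 cm\r\nQ";
string nonAscii = "q % caf\u00e9\r\nQ";

// For pure ASCII text the char count and the byte count agree...
Console.WriteLine(ascii.Length == Encoding.ASCII.GetByteCount(ascii));      // True

// ...but they diverge as soon as a character needs more than one byte.
Console.WriteLine(nonAscii.Length == Encoding.UTF8.GetByteCount(nonAscii)); // False
```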

Cross-platform line endings

writer.WriteLine() emits Environment.NewLine, which is \n on Linux and \r\n on Windows. Since all other PDF output uses explicit \r\n, replaced with writer.Write("\r\n") for consistent output everywhere.
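
The effect can be reproduced on a single machine by overriding StreamWriter.NewLine, which defaults to Environment.NewLine. The sketch below is illustrative only: Render and its two sample lines are hypothetical and not taken from the renderer, and it assumes a C# 9+ top-level program.

```csharp
using System;
using System.IO;
using System.Text;

// Writes two PDF-like lines, ending each either with WriteLine() or with an
// explicit "\r\n". platformNewLine stands in for Environment.NewLine so that
// both platforms can be simulated on one machine.
static byte[] Render(string platformNewLine, bool useWriteLine)
{
    using var stream = new MemoryStream();
    using (var writer = new StreamWriter(stream, Encoding.ASCII, 1024, leaveOpen: true))
    {
        writer.NewLine = platformNewLine;
        foreach (var line in new[] { "1 0 obj", "endobj" })
        {
            if (useWriteLine) { writer.Write(line); writer.WriteLine(); }
            else writer.Write(line + "\r\n");
        }
    } // disposing the writer flushes it; the stream stays open
    return stream.ToArray();
}

Console.WriteLine(Render("\n", useWriteLine: true).Length);    // 15 (simulated Linux)
Console.WriteLine(Render("\r\n", useWriteLine: true).Length);  // 17 (simulated Windows)
Console.WriteLine(Render("\n", useWriteLine: false).Length);   // 17 (explicit CRLF, any platform)
```

With WriteLine() the byte count depends on the platform setting; with an explicit "\r\n" it does not, which is what keeps the recorded offsets stable.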

Git binary attribute

.gitattributes had * text=auto, with .pdf marked only as diff=astextplain rather than binary. Git therefore treated PDFs as text and stripped \r from them on Linux checkout, shifting each xref offset by 1 byte for every line ending that precedes it. Changed to:

*.pdf  binary
*.PDF  binary
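
To see why line-ending normalization breaks the file, compare where an object starts before and after the \r bytes are removed. This is a small sketch on a hypothetical fragment, not on one of the approved snapshots, and it assumes a C# 9+ top-level program.

```csharp
using System;

// Hypothetical fragment as the renderer writes it, with CRLF line endings.
const string crlf = "1 0 obj\r\nendobj\r\n2 0 obj\r\n";

// The same fragment after a text-normalizing checkout has dropped every CR.
string lf = crlf.Replace("\r\n", "\n");

// Offset recorded in the xref table when the file was generated.
int recorded = crlf.IndexOf("2 0 obj", StringComparison.Ordinal);

// Offset at which the object really starts in the normalized file.
int actual = lf.IndexOf("2 0 obj", StringComparison.Ordinal);

// prints "recorded=17, actual=15, drift=2"
Console.WriteLine($"recorded={recorded}, actual={actual}, drift={recorded - actual}");
```

The drift equals the number of line endings before the object, so objects later in the file are off by more than earlier ones.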

Approved snapshots

All 6 approved PDF snapshots regenerated. They now have consistent CRLF endings and correct xref offsets on all platforms.

Xref validation test

Added pdf_xref_table_is_valid to PdfByteQRCodeRendererTests — generates a PDF, locates startxref from the end, parses the full xref table, and asserts that every in-use entry's byte offset points to the correct N 0 obj line.
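
The checks described above can be sketched roughly as follows. This is not the test added in the PR: ValidateXref and the hand-built sample file are hypothetical, and the code assumes a single xref section with one subsection, generation 0 for every object, and .NET 5+ (Encoding.Latin1, top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Returns a list of problems; an empty list means every in-use xref entry
// points at the "N 0 obj" header it claims to.
static List<string> ValidateXref(byte[] pdf)
{
    var errors = new List<string>();
    // Latin-1 maps each byte to exactly one char, so string indices equal byte offsets.
    string text = Encoding.Latin1.GetString(pdf);

    int marker = text.LastIndexOf("startxref", StringComparison.Ordinal);
    if (marker < 0) { errors.Add("startxref not found"); return errors; }

    // The number after "startxref" is the byte offset of the "xref" keyword.
    string[] tail = text.Substring(marker + "startxref".Length)
        .Split(new[] { '\r', '\n', ' ' }, StringSplitOptions.RemoveEmptyEntries);
    if (tail.Length == 0 || !int.TryParse(tail[0], out int xrefOffset)
        || xrefOffset < 0 || xrefOffset >= text.Length)
    { errors.Add("startxref offset is missing or out of range"); return errors; }

    string[] lines = text.Substring(xrefOffset).Split('\n');
    if (lines.Length < 2 || lines[0].Trim() != "xref")
    { errors.Add("startxref does not point at an xref table"); return errors; }

    // Subsection header: "<first object number> <entry count>".
    string[] header = lines[1].Trim().Split(' ');
    if (header.Length != 2 || !int.TryParse(header[0], out int first)
        || !int.TryParse(header[1], out int count))
    { errors.Add("malformed xref subsection header"); return errors; }

    for (int i = 0; i < count; i++)
    {
        if (2 + i >= lines.Length) { errors.Add($"xref entry {i} is missing"); break; }
        // Entry layout: 10-digit offset, 5-digit generation, then 'n' (in use) or 'f' (free).
        string[] parts = lines[2 + i].Trim().Split(' ');
        if (parts.Length != 3 || !int.TryParse(parts[0], out int offset))
        { errors.Add($"malformed xref entry {i}"); continue; }
        if (parts[2] != "n") continue; // free entries have no object to check

        string expected = $"{first + i} 0 obj";
        if (offset < 0 || offset + expected.Length > text.Length
            || string.CompareOrdinal(text, offset, expected, 0, expected.Length) != 0)
            errors.Add($"object {first + i}: offset {offset} does not start with \"{expected}\"");
    }
    return errors;
}

// Tiny hand-built fragment (not a complete PDF) whose xref offsets are correct.
string body = "%PDF-1.4\r\n1 0 obj\r\n<< >>\r\nendobj\r\n";
string xref = "xref\r\n0 2\r\n0000000000 65535 f\r\n0000000010 00000 n\r\n";
string trailer = "trailer\r\n<< /Size 2 >>\r\nstartxref\r\n" + body.Length + "\r\n%%EOF";
string file = body + xref + trailer;

byte[] good = Encoding.ASCII.GetBytes(file);
byte[] normalized = Encoding.ASCII.GetBytes(file.Replace("\r\n", "\n"));

Console.WriteLine(ValidateXref(good).Count);       // 0
Console.WriteLine(ValidateXref(normalized).Count); // 1
```

The second call fails at the first step: once the \r bytes are gone, the startxref offset no longer lands on the xref keyword, which is the same corruption the binary attribute prevents.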

Original prompt

This section details the original issue you should resolve

<issue_title>Fix PDF creation: /Kids must contain indirect references</issue_title>
<issue_description>Per CodeRabbit:

⚠️ Potential issue | 🔴 Critical

PDF Pages/Kids must reference an indirect Page object (not an inline dictionary).

Kids arrays shall contain indirect references to Page objects. The current inline page dictionary under /Kids violates the spec and may break validators/readers. Create a separate Page object and shift the content stream to the next object.

Apply this minimal restructuring:

-            "/Kids [ <<\r\n" +                                                      // Array of page objects - begin inline page dictionary
-                "/Type /Page\r\n" +                                                 // Declares this as a page
-                "/Parent 2 0 R\r\n" +                                               // References parent Pages object
-                "/MediaBox [0 0 " + pdfMediaSize + " " + pdfMediaSize + "]\r\n" +   // Page dimensions [x1 y1 x2 y2]
-                "/Resources << /ProcSet [ /PDF ] >>\r\n" +                          // Required resources: PDF operations only (no images)
-                "/Contents 3 0 R\r\n" +                                             // References content stream (object 3)
-                ">> ]\r\n" +                                                        // End inline page dictionary and Kids array
+            "/Kids [ 3 0 R ]\r\n" +                                                 // Kids must be indirect Page refs
             ">>\r\n" +
             "endobj\r\n"
         );
 
-        // Content stream - PDF drawing instructions
+        // Content stream - PDF drawing instructions
         var scale = ToStr(imgSize * 72 / (float)dpi / moduleCount);
         var pathCommands = CreatePathFromModules();
         var content = "q\r\n" +
           ...
           "Q";
 
-        writer.Flush();
-        xrefs.Add(stream.Position);
-
-        // Object 3: Content stream - contains the drawing instructions
-        writer.Write(
-            ToStr(xrefs.Count) + " 0 obj\r\n" +
-            "<< /Length " + ToStr(content.Length) + " >>\r\n" +
-            "stream\r\n" +
-            content + "endstream\r\n" +
-            "endobj\r\n"
-        );
+        // Object 3: Page (must be indirect)
+        writer.Flush();
+        xrefs.Add(stream.Position);
+        writer.Write(
+            ToStr(xrefs.Count) + " 0 obj\r\n" +
+            "<< /Type /Page\r\n" +
+            "/Parent 2 0 R\r\n" +
+            "/MediaBox [0 0 " + pdfMediaSize + " " + pdfMediaSize + "]\r\n" +
+            "/Resources << /ProcSet [ /PDF ] >>\r\n" +
+            "/Contents 4 0 R\r\n" +
+            ">>\r\n" +
+            "endobj\r\n"
+        );
+
+        // Object 4: Content stream - contains the drawing instructions
+        writer.Flush();
+        xrefs.Add(stream.Position);
+        writer.Write(
+            ToStr(xrefs.Count) + " 0 obj\r\n" +
+            "<< /Length " + ToStr(Encoding.ASCII.GetByteCount(content)) + " >>\r\n" +
+            "stream\r\n" +
+            content + "endstream\r\n" +
+            "endobj\r\n"
+        );

Also applies to: 117-124

</issue_description>

<agent_instructions>Confirm this problem and resolution is correct and apply an appropriate fix if necessary. Do not necessarily use the suggested code unless it is correct. Ensure tests pass and/or fix them as necessary.</agent_instructions>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: Shane32 <6377684+Shane32@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix PDF creation to ensure Kids references are indirect" to "Fix PDF creation: /Kids must contain indirect Page references" Feb 21, 2026
Copilot AI requested a review from Shane32 February 21, 2026 15:44
Copilot AI and others added 2 commits February 21, 2026 16:42
Co-authored-by: Shane32 <6377684+Shane32@users.noreply.github.com>
Co-authored-by: Shane32 <6377684+Shane32@users.noreply.github.com>
Copilot AI changed the title from "Fix PDF creation: /Kids must contain indirect Page references" to "Fix PDF spec compliance, git line-ending corruption, and add xref validation test" Feb 21, 2026
Copilot AI requested a review from Shane32 February 21, 2026 16:47
Shane32 marked this pull request as ready for review February 21, 2026 17:10
Shane32 requested a review from gfoidl February 21, 2026 17:17
Collaborator

gfoidl left a comment

👍🏻

Shane32 merged commit 4973174 into master Feb 27, 2026
7 checks passed
Shane32 deleted the copilot/fix-pdf-creation-issue branch February 27, 2026 15:29