Hi,
I use PdfSharp to automate the processing of PDF documents. For some of them, "PdfReader.Open" fails with the following exception:
Code:
InvalidOperationException "Object already in table."
I cannot share the document, but I debugged the issue and found the following one-line fix:
Code:
diff --git a/src/foundation/src/PDFsharp/src/PdfSharp/Pdf.Advanced/PdfCrossReferenceTable.cs b/src/foundation/src/PDFsharp/src/PdfSharp/Pdf.Advanced/PdfCrossReferenceTable.cs
index 7c18106..89c5e58 100644
--- a/src/foundation/src/PDFsharp/src/PdfSharp/Pdf.Advanced/PdfCrossReferenceTable.cs
+++ b/src/foundation/src/PDFsharp/src/PdfSharp/Pdf.Advanced/PdfCrossReferenceTable.cs
@@ -60,6 +60,7 @@ public void Add(PdfReference iref)
#endif
}
ObjectTable.Add(iref.ObjectID, iref);
+ MaxObjectNumber = Math.Max(MaxObjectNumber, iref.ObjectID.ObjectNumber);
}
/// <summary>
The background is that for some PDF files, Add(PdfReference) is called and adds objects to the ObjectTable, but MaxObjectNumber is not updated. When Add(PdfObject) is called later for another object that does not yet have an ID, the new ID is derived from the stale MaxObjectNumber, which collides with an entry already in the table and triggers the exception.
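To make the failure mode concrete, here is a minimal stand-alone sketch. It is not the PDFsharp source: ToyXRefTable, AddWithNumber, AddNew and the trackMax switch are hypothetical names, and I am assuming that a new object number is allocated as MaxObjectNumber + 1.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for PdfCrossReferenceTable; not the PDFsharp source.
class ToyXRefTable
{
    private readonly Dictionary<int, string> _objectTable = new Dictionary<int, string>();
    private readonly bool _trackMax;

    public int MaxObjectNumber { get; private set; }

    // trackMax = false models the current behaviour, true models the one-line fix.
    public ToyXRefTable(bool trackMax) { _trackMax = trackMax; }

    // Models Add(PdfReference): the entry already carries an object number.
    public void AddWithNumber(int objectNumber, string name)
    {
        if (_objectTable.ContainsKey(objectNumber))
            throw new InvalidOperationException("Object already in table.");
        _objectTable.Add(objectNumber, name);
        if (_trackMax)
            MaxObjectNumber = Math.Max(MaxObjectNumber, objectNumber);
    }

    // Models Add(PdfObject) for an object without an ID.
    // Assumption: the next number is MaxObjectNumber + 1.
    public int AddNew(string name)
    {
        int objectNumber = MaxObjectNumber + 1;
        AddWithNumber(objectNumber, name);
        MaxObjectNumber = objectNumber;
        return objectNumber;
    }
}

static class CollisionDemo
{
    public static string Run(bool trackMax)
    {
        var table = new ToyXRefTable(trackMax);
        table.AddWithNumber(1, "object read from file");
        try
        {
            return "assigned " + table.AddNew("new object");
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run(trackMax: false)); // Object already in table.
        Console.WriteLine(Run(trackMax: true));  // assigned 2
    }
}
```

Without tracking, the table still believes the highest number is 0 after object 1 was added, so the next new object is also given number 1.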
My fix is minimal and works for me, but the underlying design is fragile: it is easy to add an entry to the ObjectTable without updating MaxObjectNumber. An abstraction layer could take care of this, but that is better left to someone familiar with the code. Maybe the quick fix can be merged in the meantime?
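As a rough idea of what such an abstraction could look like, here is a sketch in which the dictionary is private and every insertion goes through one method that also maintains the maximum. TrackedObjectTable and its members are hypothetical names, not existing PDFsharp types, and the real table is keyed by PdfObjectID rather than a plain int.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper, not part of PDFsharp. The dictionary is private,
// so callers cannot add an entry without the maximum being updated.
class TrackedObjectTable<TValue>
{
    private readonly Dictionary<int, TValue> _entries = new Dictionary<int, TValue>();

    public int MaxObjectNumber { get; private set; }
    public int Count => _entries.Count;

    public bool Contains(int objectNumber) => _entries.ContainsKey(objectNumber);

    // The only way to insert: updates the maximum in the same step.
    public void Add(int objectNumber, TValue value)
    {
        _entries.Add(objectNumber, value); // throws ArgumentException on a duplicate key
        MaxObjectNumber = Math.Max(MaxObjectNumber, objectNumber);
    }

    // For objects that do not have a number yet.
    public int AddWithNewNumber(TValue value)
    {
        int objectNumber = MaxObjectNumber + 1;
        Add(objectNumber, value);
        return objectNumber;
    }
}

static class WrapperDemo
{
    public static int Run()
    {
        var table = new TrackedObjectTable<string>();
        table.Add(7, "object read from file");
        return table.AddWithNewNumber("new object");
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 8
    }
}
```

With this shape the one-line fix is no longer something each caller has to remember, because there is no second path into the table.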
I can also provide the fix as a git commit, but I could not find out on the project's home page how to contribute this way.