The linked file contains a (somewhat) broken xref table.
The xref table starts with
xref
1 7
when it should be
xref
0 7
According to the PDF specification, version 1.7 (section 7.5.4):
Quote:
For a file that has never been incrementally updated, the cross-reference section shall contain only one subsection, whose object numbering begins at 0.
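PdfSharp trusts the start number in the subsection header, so every entry is assigned an object number one higher than intended. When it reads the rest of the file, references inside the PDF then point to the wrong objects.
A possible workaround is to make PdfSharp number the first subsection starting at object 0, regardless of the number given in the xref table, as long as that subsection begins with a free entry.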
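To illustrate the shift, here is a minimal standalone sketch. It does not use PdfSharp; the names XrefEntry and XrefNumberingSketch and the byte offsets are made up for illustration. It numbers the same three entries once from the start number as written (1) and once from 0.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one line of a classic xref table.
public struct XrefEntry
{
    public readonly int Offset;
    public readonly int Generation;
    public readonly char Type; // 'n' = in use, 'f' = free

    public XrefEntry(int offset, int generation, char type)
    {
        Offset = offset;
        Generation = generation;
        Type = type;
    }
}

public static class XrefNumberingSketch
{
    // Assigns object numbers firstId, firstId + 1, ... to the entries
    // and returns object number -> byte offset for the in-use entries.
    public static Dictionary<int, int> Number(int firstId, XrefEntry[] entries)
    {
        var offsets = new Dictionary<int, int>();
        for (int i = 0; i < entries.Length; i++)
        {
            if (entries[i].Type == 'n')
                offsets[firstId + i] = entries[i].Offset;
        }
        return offsets;
    }

    public static void Main()
    {
        // Offsets are invented; the first entry is the free entry of object 0.
        var entries = new[]
        {
            new XrefEntry(0, 65535, 'f'),
            new XrefEntry(15, 0, 'n'),
            new XrefEntry(80, 0, 'n'),
        };

        var asWritten = Number(1, entries); // header says "1 3"
        var corrected = Number(0, entries); // header should say "0 3"

        Console.WriteLine(asWritten.ContainsKey(1)); // False: object 1 is missing
        Console.WriteLine(asWritten[2]);             // 15: really the offset of object 1
        Console.WriteLine(corrected[1]);             // 15
        Console.WriteLine(corrected[2]);             // 80
    }
}
```

With the start number as written, a reference to object 2 lands on the bytes of object 1, which matches the symptom described above.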
The place to fix this is PdfSharp.Pdf.IO.Parser.cs, in the method "ReadXRefTableAndTrailer".
Replace the complete method with the code below.
(Note: I used version 1.32.)
Code:
PdfTrailer ReadXRefTableAndTrailer(PdfReferenceTable xrefTable)
{
    Debug.Assert(xrefTable != null);
    Symbol symbol = ScanNextToken();
    // Is it an xref stream?
    if (symbol == Symbol.Integer)
        throw new PdfReaderException(PSSR.CannotHandleXRefStreams);
    // TODO: Xref streams are very high on the todo list, but still not done.
    Debug.Assert(symbol == Symbol.XRef);
    var firstSection = true;
    while (true)
    {
        symbol = ScanNextToken();
        if (symbol == Symbol.Integer)
        {
            int start = this.lexer.TokenToInteger;
            var firstId = start;
            if (firstSection)
                firstId = 0; // The first subsection (which may be the only one) shall start with object number 0 (PDF Reference 1.7, 7.5.4).
            int length = ReadInteger();
            var lastId = firstId + length;
            for (int id = firstId; id < lastId; id++)
            {
                int position = ReadInteger();
                int generation = ReadInteger();
                ReadSymbol(Symbol.Keyword);
                string token = lexer.Token;
                if (id == 0 && firstSection)
                {
                    if (token == "f") // The first entry is a free entry, as expected.
                        continue;
                    else
                    {
                        // The free entry is missing, so fall back to the object number given in the header
                        // (should we throw in this case?).
                        id = start;
                        lastId = start + length;
                    }
                }
                // Skip unused entries.
                if (token != "n")
                    continue;
                // Even though it is restricted, an object can exist in more than one subsection
                // (PDF Reference, Implementation Note 15).
                PdfObjectID objectID = new PdfObjectID(id, generation);
                // Ignore the later one.
                if (xrefTable.Contains(objectID))
                    continue;
                xrefTable.Add(new PdfReference(objectID, position + lexer.StartOffset));
            }
            firstSection = false;
        }
        else if (symbol == Symbol.Trailer)
        {
            ReadSymbol(Symbol.BeginDictionary);
            PdfTrailer trailer = new PdfTrailer(this.document);
            this.ReadDictionary(trailer, false);
            return trailer;
        }
        else
            throw new PdfReaderException(PSSR.UnexpectedToken(this.lexer.Token));
    }
}
With these changes, I was able to open the linked PDF.
Maybe the PdfSharp team has better or additional ideas to fix this?
(Well, it's not really a fix for a bug in PdfSharp; it just makes the library more tolerant of errors originating elsewhere.)
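One possible refinement, offered as an untested idea rather than as part of the change above: the replacement method treats any leading free entry in the first subsection as object 0. If a first subsection legitimately started at a higher number with a free entry, its remaining entries would be numbered from 1. Since section 7.5.4 also says the entry for object 0 has generation number 65535, the check could additionally look at the generation. This is only a heuristic, because other free entries can carry that generation too. The helper name below is hypothetical.

```csharp
using System;

public static class FreeEntryCheckSketch
{
    // Generation number the spec prescribes for the free entry of object 0.
    const int ObjectZeroGeneration = 65535;

    // Returns true if an xref entry looks like the free entry of object 0.
    // Hypothetical helper; not part of PdfSharp.
    public static bool LooksLikeObjectZero(string token, int generation)
    {
        return token == "f" && generation == ObjectZeroGeneration;
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeObjectZero("f", 65535)); // True
        Console.WriteLine(LooksLikeObjectZero("f", 1));     // False
        Console.WriteLine(LooksLikeObjectZero("n", 0));     // False
    }
}
```

In the method above, such a check would take the place of the plain token == "f" test on the first entry.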