PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

PdfDocument.PageCount throw exception
https://forum.pdfsharp.net/viewtopic.php?f=2&t=3524

Author:  rajarul [ Thu Jan 12, 2017 7:21 am ]
Post subject:  PdfDocument.PageCount throw exception

Hi PdfSharp Team,

I wrote the code below to read an existing PDF file and get its page count.

using (PdfDocument InputDocument = PdfReader.Open(PDFPath, PdfDocumentOpenMode.Import))
{
    if (InputDocument != null && InputDocument.Pages.Count != 0)
        _ActualPageCount = InputDocument.PageCount;
}

This code runs without any issue for most files, but for one particular PDF file it throws an error at "InputDocument.Pages.Count" with the message "Object reference not set to an instance".

I would appreciate your help in fixing this problem.

Thanks and Regards,

Raj.A

Author:  Thomas Hoevel [ Thu Jan 12, 2017 9:19 am ]
Post subject:  Re: PdfDocument.PageCount throw exception

rajarul wrote:
This code runs without any issue for most files, but for one particular PDF file it throws an error at "InputDocument.Pages.Count" with the message "Object reference not set to an instance".
We need the PDF file to replicate the issue.

See also:
viewtopic.php?f=2&t=832

Author:  rajarul [ Fri Jan 13, 2017 4:32 am ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi PdfSharpTeam,

I have attached the PDF file to replicate this issue. Kindly let me know if you have any difficulty.


Thanks and Regards,

Raj.A

Attachments:
SWINX.zip [14.27 KiB]
Downloaded 405 times

Author:  Thomas Hoevel [ Mon Jan 16, 2017 4:48 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi, Raj.A!
rajarul wrote:
I have attached the PDF file to replicate this issue. Kindly let me know if you have any difficulty.
You attached a couple of HTML files, but no PDF.

Author:  rajarul [ Wed Jan 18, 2017 7:44 am ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi Thomas,

My apologies for this mistake.

You can now download the file from this URL: https://github.com/rajarul82/SamplePDF/blob/master/adlibris.pdf. The file was too large to attach here.

Thanks and Regards,

Raj.A

Author:  Thomas Hoevel [ Wed Jan 18, 2017 12:19 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi, Raj.A,

Thanks for the file.
Something goes wrong when PDFsharp tries to open that PDF file. I don't know yet what goes wrong or how to fix it.

Author:  rajarul [ Thu Jan 19, 2017 2:19 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi Thomas,

Is there any update regarding this issue?

Thanks and Regards,

Raj.A

Author:  st0smith [ Wed Jun 21, 2017 5:03 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi:

I am also getting an object reference error on the PageCount property for several PDFs that FoxIt can open without trouble. So far, our workaround has been to open the file with FoxIt, save it as another file, then upload that file again. However, the users are getting restless.

Thanks,
Steve

Author:  (void) [ Thu Jun 22, 2017 7:29 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

The linked file contains a (somewhat) broken xref table.
The xref table starts with
xref
1 7
when it should be
xref
0 7

According to the Pdf-Spec, Version 1.7:
Quote:
For a file that has never been incrementally updated, the cross-reference section shall contain only one subsection, whose object numbering begins at 0.


When PdfSharp reads the rest of the file, this leads to references inside the pdf pointing to the wrong objects.
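
To check whether another file shows the same symptom, one can look at the header of the first classic xref table in the raw bytes. The following is a minimal sketch that does not use PDFsharp at all; the names XrefCheck and FirstXrefObjectNumber are made up for illustration. It is only a heuristic: linearized or incrementally updated files can legitimately contain xref subsections that do not start at 0, and files that use xref streams have no "xref" keyword.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class XrefCheck
{
    // Hypothetical helper: returns the first object number declared by the
    // first "xref" table found in file order, or -1 if none is found.
    public static int FirstXrefObjectNumber(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data cannot break the decoding step.
        string text = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);

        // Match the "xref" keyword (but not "startxref") followed by the
        // subsection header "<first object number> <entry count>".
        Match m = Regex.Match(text, @"(?<![a-z])xref\s+(\d+)\s+(\d+)");
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }
}

static class Program
{
    static void Main()
    {
        // Header as found in the linked file: the table claims to start at object 1.
        byte[] broken = Encoding.ASCII.GetBytes(
            "%PDF-1.4\nxref\n1 7\n0000000000 65535 f \n");
        Console.WriteLine(XrefCheck.FirstXrefObjectNumber(broken)); // prints 1
    }
}
```

A result other than 0 for a file that was never incrementally updated suggests the same kind of broken table as in the linked file.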

A possible fix may be forcing PdfSharp to always start with object number 0, regardless of the number given in the xref table.

One place to fix this is PdfSharp.Pdf.IO.Parser.cs, which has a method called "ReadXRefTableAndTrailer".
Replace the complete method with this code:
(Note: I used version 1.32)
Code:
        PdfTrailer ReadXRefTableAndTrailer(PdfReferenceTable xrefTable)
        {
            Debug.Assert(xrefTable != null);

            Symbol symbol = ScanNextToken();
            // Is it an xref stream?
            if (symbol == Symbol.Integer)
                throw new PdfReaderException(PSSR.CannotHandleXRefStreams);
            // TODO: It is very high on the todo list, but still undone
            Debug.Assert(symbol == Symbol.XRef);
            var firstSection = true;
            while (true)
            {
                symbol = ScanNextToken();
                if (symbol == Symbol.Integer)
                {
                    int start = this.lexer.TokenToInteger;
                    var firstId = start;
                    if (firstSection)
                        firstId = 0;            // first section (that may be the one and only section) shall start with object number 0 (Pdf Reference 1.7, 7.5.4)
                    int length = ReadInteger();
                    var lastId = firstId + length;
                    for (int id = firstId; id < lastId; id++)
                    {
                        int position = ReadInteger();
                        int generation = ReadInteger();
                        ReadSymbol(Symbol.Keyword);
                        string token = lexer.Token;
                        if (id == 0 && firstSection)
                        {
                            if (token == "f")           // first entry is a free entry, ok
                                continue;
                            else
                            {
                                id = start;             // free entry is missing, use given object id (should we throw in this case ?)
                                lastId = start + length;
                            }
                        }
                        // Skip unused entries.
                        if (token != "n")
                            continue;
                        // Even though it is restricted, an object can exist in more than one subsection
                        // (PDF Reference Implementation Notes 15).
                        PdfObjectID objectID = new PdfObjectID(id, generation);
                        // Ignore the latter one
                        if (xrefTable.Contains(objectID))
                            continue;
                        xrefTable.Add(new PdfReference(objectID, position + lexer.StartOffset));
                    }
                    firstSection = false;
                }
                else if (symbol == Symbol.Trailer)
                {
                    ReadSymbol(Symbol.BeginDictionary);
                    PdfTrailer trailer = new PdfTrailer(this.document);
                    this.ReadDictionary(trailer, false);
                    return trailer;
                }
                else
                    throw new PdfReaderException(PSSR.UnexpectedToken(this.lexer.Token));
            }
        }

With these changes, I was able to open the linked PDF.

Maybe the PdfSharp team has better or additional ideas to fix this?
(Well, it is not actually a fix for anything; it just makes the library more tolerant of errors originating elsewhere.)
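
The numbering rule that the patched method applies to the first subsection can be looked at in isolation. This is only a sketch of that rule, not PDFsharp code: TolerantXref and AssignIds are made-up names, and the duplicate-object check from the patch is left out. If the first entry is a free entry ("f"), it is treated as object 0 regardless of the declared start; otherwise the declared start is trusted.

```csharp
using System;
using System.Collections.Generic;

static class TolerantXref
{
    // Hypothetical helper. entryTypes holds the keyword of each entry in the
    // first xref subsection: 'f' (free) or 'n' (in use).
    // Returns the object numbers assigned to the in-use entries.
    public static List<int> AssignIds(int declaredStart, IReadOnlyList<char> entryTypes)
    {
        var ids = new List<int>();
        if (entryTypes.Count == 0)
            return ids;

        // A leading free entry must be object 0, so ignore the declared start.
        // Without it, fall back to the number given in the subsection header.
        int first = entryTypes[0] == 'f' ? 0 : declaredStart;

        for (int i = 0; i < entryTypes.Count; i++)
        {
            if (entryTypes[i] == 'n')   // skip free entries
                ids.Add(first + i);
        }
        return ids;
    }
}

static class Program
{
    static void Main()
    {
        // Broken header "1 7" with a leading free entry: objects are numbered from 0.
        var ids = TolerantXref.AssignIds(1, "fnnnnnn".ToCharArray());
        Console.WriteLine(string.Join(",", ids)); // prints 1,2,3,4,5,6
    }
}
```

With a strict reading of the header "1 7", the same seven entries would instead be numbered 1 to 7, which shifts every in-use object by one and explains why references in the file end up pointing at the wrong objects.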

Author:  st0smith [ Mon Jun 26, 2017 4:11 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hey, thanks (void). This fix seems to work very well for version 1.32. Unfortunately, I need to use version 1.5, and it doesn't work there.

Author:  rajarul [ Tue Jun 27, 2017 12:55 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Thanks for your support, but I have since used another library to handle this issue.

Author:  JimboBaggins [ Wed Feb 14, 2018 10:46 am ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi folks.

I am currently using PDF Sharp GDI v1.50.4619-beta4c and am also experiencing this issue with some PDFs that come through our live environment. I have tested with the PDF provided by rajarul above and with the client PDF I am having issues with. The result is the same for each.

Has there been a resolution to this? At this stage it looks like using another library might be a solution, but I would rather not take that step, as it would require significant additional time, testing and energy. I like PDF Sharp and wish to keep using it.

Any help/update would be appreciated, so I can plan the appropriate steps.

Thanks

Author:  Thomas Hoevel [ Wed Feb 14, 2018 12:04 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi!
JimboBaggins wrote:
Has there been a resolution to this?
Could it be this exception occurs for "corrupted" files?
There are some pull requests on GitHub that are meant to make PDFsharp more robust when dealing with corrupt files.
You can incorporate those pull requests into your local version to see if that solves the issue.

Author:  JimboBaggins [ Thu Feb 15, 2018 12:12 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

Hi Thomas,

Thank you for the response. I am not sure that the files are corrupt, I have tried the suggestion on this page (viewtopic.php?p=5758#p5758) from PDF Sharp Team (albeit over five years old at this stage) regarding opening from Adobe Reader and it opens the files fine.

Neither wants to save on close, which maybe indicates a deviation from standards.

I will take a look at the pull requests. Thanks

Author:  Thomas Hoevel [ Thu Feb 15, 2018 3:51 pm ]
Post subject:  Re: PdfDocument.PageCount throw exception

JimboBaggins wrote:
Neither wants to save on close, which maybe indicates a deviation from standards.
Adobe Reader fixes many issues when you invoke File => Save As, even if it does not prompt to save the file.
Most likely the file saved from Reader will work with PDFsharp.

All times are UTC