PDFsharp & MigraDoc Forum
https://forum.pdfsharp.net/

Problem with some PDF files
https://forum.pdfsharp.net/viewtopic.php?f=2&t=2699

Author:  laroom [ Fri Jan 03, 2014 1:42 pm ]
Post subject:  Problem with some PDF files

I have a PDF file that PDFsharp cannot handle.
The file seems to lack a "/Kids" entry, which causes a null reference exception in PdfPages.GetKids(...)

Code:
        PdfArray kids = kid.Elements["/Kids"] as PdfArray;
        //newTHHO 15.10.2007 begin
        if (kids == null)
        {
          PdfReference xref3 = kid.Elements["/Kids"] as PdfReference;
          kids = xref3.Value as PdfArray; // <--- Null reference here as kid.Elements does not contain tag "/Kids"
        }


I have attached a sample project that includes the "faulty" PDF file.
Is it possible to make PDFsharp tolerate this?



Attachments:
ConcatenateDocuments.zip [41.51 KiB]

Author:  () => true [ Fri Jan 03, 2014 3:24 pm ]
Post subject:  Re: Problem with some PDF files

laroom wrote:
Is it possible to make PDFsharp tolerate this?
We'll check this after the holidays.

Author:  Thomas Hoevel [ Mon Jan 13, 2014 5:51 pm ]
Post subject:  Re: Problem with some PDF files

Hi!

I can replicate the exception, but cannot fix it. I will ask Stefan.

Author:  laroom [ Wed May 14, 2014 1:56 pm ]
Post subject:  Re: Problem with some PDF files

Any news on this issue?

Author:  kensands [ Thu Oct 16, 2014 1:50 pm ]
Post subject:  Re: Problem with some PDF files

I had this problem and fixed it by creating an empty array when there are no kids. It seems to work for me.

Code:
 
          if (xref3 == null)
          {
            kids = new PdfArray();
          }
          else
          {
            kids = xref3.Value as PdfArray;
          }


The full edited function is shown below:

Code:
    PdfDictionary[] GetKids(PdfReference iref, PdfPage.InheritedValues values, PdfDictionary parent)
    {
      // TODO: inherit inheritable keys...
      PdfDictionary kid = (PdfDictionary)iref.Value;

      if (kid.Elements.GetName(Keys.Type) == "/Page")
      {
        PdfPage.InheritValues(kid, values);
        return new PdfDictionary[] { kid };
      }
      else
      {
        Debug.Assert(kid.Elements.GetName(Keys.Type) == "/Pages");
        PdfPage.InheritValues(kid, ref values);
        List<PdfDictionary> list = new List<PdfDictionary>();
        PdfArray kids = kid.Elements["/Kids"] as PdfArray;
        //newTHHO 15.10.2007 begin
        if (kids == null)
        {
          PdfReference xref3 = kid.Elements["/Kids"] as PdfReference;

          if (xref3 == null)
          {
            kids = new PdfArray();
          }
          else
          {
            kids = xref3.Value as PdfArray;
          }
        }
        //newTHHO 15.10.2007 end
        foreach (PdfReference xref2 in kids)
          list.AddRange(GetKids(xref2, values, kid));
        int count = list.Count;
        Debug.Assert(count == kid.Elements.GetInteger("/Count"));
        //return (PdfDictionary[])list.ToArray(typeof(PdfDictionary));
        return list.ToArray();
      }
    }
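
The same fallback can be written as a small helper that resolves "/Kids" whether it is stored directly, stored behind an indirect reference, or missing altogether. This is only a sketch: PageTreeHelpers and ResolveKids are hypothetical names, not part of PDFsharp. It also returns an empty array when the reference points to something that is not an array, a case in which the code above would leave kids set to null. Like the edit above, it only avoids the null reference and does not recover pages from a damaged page tree.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;

// Hypothetical helper class for illustration; not part of PDFsharp.
static class PageTreeHelpers
{
    // Returns the /Kids array of a page tree node, or an empty array when
    // the entry is missing or does not resolve to an array.
    public static PdfArray ResolveKids(PdfDictionary node)
    {
        // The indexer yields null when the key is absent.
        PdfItem item = node.Elements["/Kids"];

        // If /Kids is an indirect reference, follow it to the referenced object.
        PdfReference reference = item as PdfReference;
        if (reference != null)
            item = reference.Value;

        // 'as' yields null for a missing entry or an unexpected type,
        // in which case an empty array is returned instead.
        return (item as PdfArray) ?? new PdfArray();
    }
}
```

Inside GetKids, the lines between the two "newTHHO" markers could then be replaced by a single call such as PdfArray kids = PageTreeHelpers.ResolveKids(kid);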



P.S. PDFsharp is fantastic. I use a combination of it and Google's PDFium code (for rendering and text extraction) for all my PDF systems now. Happy to contribute where possible.

Author:  Thomas Hoevel [ Wed Oct 22, 2014 11:51 am ]
Post subject:  Re: Problem with some PDF files

kensands wrote:
I had this problem and fixed it by creating an empty array when there are no kids. It seems to work for me.
Thanks for the feedback. I'm afraid the fix does not work for the PDF file given in the first post.
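
Until the library itself can read such files, a caller-side guard can at least keep one bad file from aborting a whole concatenation run. The following is a minimal sketch modeled on the ConcatenateDocuments scenario from the first post. SafeConcatenate and Merge are hypothetical names, and catching every exception type is a deliberate simplification, on the assumption that the failure surfaces while the file is opened or its pages are accessed.

```csharp
using System;
using System.Collections.Generic;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical wrapper for illustration; not part of PDFsharp.
static class SafeConcatenate
{
    // Appends the pages of every readable input file to one output document
    // and returns the list of files that could not be imported.
    public static List<string> Merge(IEnumerable<string> inputFiles, string outputFile)
    {
        var skipped = new List<string>();
        var output = new PdfDocument();

        foreach (string file in inputFiles)
        {
            try
            {
                // Import mode is required to copy pages into another document.
                PdfDocument input = PdfReader.Open(file, PdfDocumentOpenMode.Import);
                for (int i = 0; i < input.PageCount; i++)
                    output.AddPage(input.Pages[i]);
            }
            catch (Exception ex)
            {
                // Broad catch on purpose: record the file and continue.
                // Pages added before the failure remain in the output.
                skipped.Add(file);
                Console.Error.WriteLine("Skipping " + file + ": " + ex.Message);
            }
        }

        // Only save when at least one page was imported.
        if (output.PageCount > 0)
            output.Save(outputFile);

        return skipped;
    }
}
```

This does not make the problematic file readable. It only reports which inputs were skipped so they can be repaired with another tool and merged later.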

All times are UTC