PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


PostPosted: Wed Jan 29, 2014 6:14 pm 
Joined: Wed Jan 29, 2014 3:50 pm
Posts: 6
Hi,

I have a PDF file that led me to several conclusions and some questions:

When I first tried the delivered Booklet example, I got an exception in the PdfSharp.Pdf.Filters.Filter.RemoveWhiteSpace method, and I'm convinced that it never worked as originally written. My suggested correction is here:
Code:
  protected byte[] RemoveWhiteSpace(byte[] data)
  {
    int count = data.Length;
    int j = 0;
    for (int i = 0; i < count; i++, j++)
    {
      switch (data[i])
      {
        case (byte)Chars.NUL:  // 0 Null
        case (byte)Chars.HT:   // 9 Tab
        case (byte)Chars.LF:   // 10 Line feed
        case (byte)Chars.FF:   // 12 Form feed
        case (byte)Chars.CR:   // 13 Carriage return
        case (byte)Chars.SP:   // 32 Space
          j--;
          break;
 
        default:
          if (i != j)
            data[j] = data[i];
          break;
      }
    }
    if (j < count) // MM correction: shrink the array to the j bytes actually kept
    {
      byte[] temp = data;
      data = new byte[j];
      Array.Copy(temp, data, j);
    }
    return data;
  }


After that I got another exception, this time in the PdfSharp.Pdf.Filters.ASCIIHexDecode.Decode method, where the original code was
Code:
count <<= 2;
but I'm convinced it should be
Code:
count /= 2;
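
For readers following along: in ASCIIHex data each pair of hex digits encodes one byte, so the decoded buffer is half as long as the whitespace-stripped input, which is why a division by two is expected at this point. The following is only a minimal stand-alone sketch of that sizing, not PDFsharp's actual implementation; the names `HexSketch` and `DecodeAsciiHex` are hypothetical.

```csharp
using System;

static class HexSketch
{
    // Hypothetical helper, not part of PDFsharp: decodes ASCIIHex data whose
    // whitespace has already been removed. A trailing '>' (end of data) is ignored.
    public static byte[] DecodeAsciiHex(byte[] data)
    {
        int count = data.Length;
        if (count > 0 && data[count - 1] == (byte)'>')
            count--;

        // Two hex digits make one byte; an odd final digit is treated as
        // if it were followed by '0'.
        byte[] result = new byte[(count + 1) / 2];
        for (int i = 0; i < count; i++)
        {
            int nibble = HexValue(data[i]);
            if (i % 2 == 0)
                result[i / 2] = (byte)(nibble << 4);   // high nibble
            else
                result[i / 2] |= (byte)nibble;         // low nibble
        }
        return result;
    }

    static int HexValue(byte b)
    {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        throw new FormatException("Not a hex digit: " + b);
    }
}
```

With this sketch, ten hex digits followed by `>` decode to five bytes.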


After that, execution reached the PdfSharp.SharpZipLib.Zip.Compression.Streams.InflaterInputStream.Read method, where the inf.Inflate method throws PdfSharp.SharpZipLib.SharpZipBaseException: Adler chksum doesn't match: 1526531981 vs. 91641681.
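
As background on this message: a zlib stream (RFC 1950) ends with an Adler-32 checksum of the uncompressed data, and the inflater compares it with the checksum it computes over the bytes it actually produced. A mismatch can therefore mean that the stream itself is damaged or that the bytes handed to the inflater were already wrong, for example after a faulty earlier decoding stage; which of these applies here is not established. The sketch below shows how the checksum is computed; `Adler32Sketch` is a hypothetical name and this is not the SharpZipLib implementation.

```csharp
using System;

static class Adler32Sketch
{
    const uint Modulus = 65521; // largest prime below 2^16

    // Computes the Adler-32 checksum of the given (uncompressed) bytes.
    public static uint Compute(byte[] data)
    {
        uint a = 1; // running sum of the bytes, starting at 1
        uint b = 0; // running sum of the successive values of a
        foreach (byte value in data)
        {
            a = (a + value) % Modulus;
            b = (b + a) % Modulus;
        }
        return (b << 16) | a;
    }
}
```

Comparing the result of such a function on the decoded stream with the four trailing bytes of the zlib data (big-endian) is one way to check which side of the comparison is off.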

Unfortunately, I'm not familiar enough with PDF content handling to locate the cause of this issue. The fixes I've provided were only the obvious ones needed to make the code continue without exceptions.

I also have another file (which I don't think I can share here due to its content), and with that one I get a different exception at the same place: PdfSharp.SharpZipLib.SharpZipBaseException: broken uncompressed block. I don't know why, in this case, the inf.Inflate method was called with the b {byte[32768]} array filled with zeroes.

I believe it all relates to this part of the core:
Code:
  // Import resources
  PdfItem res = importPage.Elements["/Resources"];
  if (res != null) // unlikely but possible

because in fact all the files I have to process contain this "unlikely" element.


Could someone have a look at the attached file and use the delivered Booklet example to debug the core of PDFsharp? Any (especially prompt) response would be greatly appreciated.

Thank you in advance,
Mirek

P.S. I tried to attach the PDF file, but the forum responds that the image file has an invalid format...


PostPosted: Thu Jan 30, 2014 9:13 am 
PDFsharp Guru
Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
miroslavmandl wrote:
P.S. I tried to attach the PDF file, but the forum responds that the image file has an invalid format...
You can attach ZIP files (up to 256 kiB IIRC). Larger files can be submitted via e-mail.

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Thu Jan 30, 2014 9:24 am 
Joined: Wed Jan 29, 2014 3:50 pm
Posts: 6
OK. Here is the file.


Attachments:
SchHDPE_153 RAA.zip [41.8 KiB]
PostPosted: Tue Feb 04, 2014 1:27 pm 
PDFsharp Guru
Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
Hi!

Yes, you found two bugs.

I can reproduce the "Adler chksum doesn't match" error message, but cannot say whether this is a problem with the PDF file or a problem with the PDFsharp code.
This is not my area of expertise, so I'll ask Stefan to have a look at it.

I presume we didn't have any PDF files to test ASCIIHexDecode before. Since this filter doubles the size of the encoded data, there is no advantage in using it.
Difficult to investigate the "/Resources" problem without a sample file.
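
To illustrate the size doubling mentioned for ASCIIHexDecode: every input byte is written as two hex digits, so the encoded form is twice as long as the data (plus the `>` end-of-data marker). A minimal sketch, with the hypothetical names `HexEncodeSketch` and `EncodeAsciiHex` (this is not PDFsharp code):

```csharp
using System;
using System.Text;

static class HexEncodeSketch
{
    // Hypothetical helper: writes each byte as two uppercase hex digits
    // and appends the '>' end-of-data marker.
    public static string EncodeAsciiHex(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 2 + 1);
        foreach (byte value in data)
            sb.Append(value.ToString("X2"));
        sb.Append('>');
        return sb.ToString();
    }
}
```

Two input bytes thus become four hex digits followed by the marker.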

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Thu Feb 06, 2014 8:59 am 
Joined: Wed Jan 29, 2014 3:50 pm
Posts: 6
Hi,

Regarding the "Adler chksum doesn't match" error: since the file can be displayed with Adobe Acrobat, I'd assume the issue is in your code.
Regarding the "/Resources" problem: an example file is attached to this thread.

Fortunately, I've managed to get the original files without ASCIIHexDecode, so I no longer depend on you fixing this part of your library, and it's entirely up to you how much time you spend investigating the above issues.

Best regards,
Mirek

