PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

Decode a page with Filters: /FlateDecode and /DCTDecode
https://forum.pdfsharp.net/viewtopic.php?f=3&t=882

Author:  laurentl [ Wed Sep 23, 2009 12:21 pm ]
Post subject:  Decode a page with Filters: /FlateDecode and /DCTDecode

Hello,

I am using PDFsharp to generate a single bitmap of a PDF page. My PDFs are the output of an OCR process on TIFF files (using the ABBYY Recognition Server).
I tried the ExportImage project and I really like it: it works fine and renders the image very quickly.
I think that I found a bug when I am trying to generate a bitmap for a pdf page having the two filters: FlateDecode and DCTDecode.

The public static byte[] Decode(byte[] data, PdfItem filterItem) method first applies the FlateDecode filter and then tries DCTDecode. Since the DCTDecode filter does not do anything, it returns null, and the FlateDecode result is lost. I guess the method should return the FlateDecode result in that case?

Here is what I changed:

before:
data = Filtering.Decode(data, item);

changed to:
data = Filtering.Decode(data, item) ?? data;
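
To illustrate the effect of that change, here is a minimal stand-alone sketch of a filter chain in which one filter returns null. It is not the PDFsharp source: FilterChainSketch, Decoder, DecodeAll, FakeFlate and FakeDct are hypothetical names, and the two "filters" are stand-ins that do no real decompression.

```csharp
using System;
using System.Collections.Generic;

// Stand-alone illustration; none of these names are PDFsharp APIs.
public static class FilterChainSketch
{
    // A decoder returns the decoded bytes, or null if it does not
    // implement decoding (the behavior described for DCTDecode).
    public delegate byte[] Decoder(byte[] data);

    // Applies the decoders in order. A null result keeps the data from
    // the previous step, mirroring the proposed "?? data" change.
    public static byte[] DecodeAll(byte[] data, IEnumerable<Decoder> decoders)
    {
        foreach (Decoder decode in decoders)
        {
            data = decode(data) ?? data;
        }
        return data;
    }

    // Stand-in for a working filter: drops a one-byte "header".
    public static byte[] FakeFlate(byte[] data)
    {
        byte[] result = new byte[data.Length - 1];
        Array.Copy(data, 1, result, 0, result.Length);
        return result;
    }

    // Stand-in for a filter that does nothing and returns null.
    public static byte[] FakeDct(byte[] data)
    {
        return null;
    }
}
```

With the decoders FakeFlate followed by FakeDct, DecodeAll returns the output of FakeFlate instead of null. In the real case, the bytes left after FlateDecode would presumably still be JPEG-encoded, so the caller has to treat them as JPEG data.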

What do you think?


Thanks

Laurent

Author:  Thomas Hoevel [ Wed Sep 23, 2009 12:34 pm ]
Post subject:  Re: Decode a page with Filters: /FlateDecode and /DCTDecode

Hi!
laurentl wrote:
I think that I found a bug when I am trying to generate a bitmap for a pdf page having the two filters: FlateDecode and DCTDecode.

Is this a bug in PDFsharp or a bug in that PDF file?

It can't hurt to make PDFsharp more robust against document errors, but I'd like to know whether that page still has both filters after the file has been opened and saved with Adobe Acrobat (not the Reader).

I'm afraid that ?? is not compatible with VS 2005, so I'd use an if instead.
I'll inform our experts about your proposal.
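
For reference, the if-based variant could look roughly like the sketch below. It is only an illustration of the pattern, not code from PDFsharp: NullFallbackSketch, Decoder and DecodeOrKeep are hypothetical names.

```csharp
// Hypothetical helper for illustration; not taken from the PDFsharp source.
public static class NullFallbackSketch
{
    // Stand-in for a filter's decode step; may return null.
    public delegate byte[] Decoder(byte[] data);

    // Same effect as "data = decode(data) ?? data;", written with an if.
    public static byte[] DecodeOrKeep(byte[] data, Decoder decode)
    {
        byte[] decoded = decode(data);
        if (decoded != null)
        {
            data = decoded;
        }
        return data;
    }
}
```

The input is returned unchanged whenever the decoder yields null, and the decoder's output is returned otherwise.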

Author:  laurentl [ Wed Sep 23, 2009 12:56 pm ]
Post subject:  Re: Decode a page with Filters: /FlateDecode and /DCTDecode

Hi Thomas,

I think that the bug is in PDFsharp, in the line where I made the change, but I cannot be 100% sure. When I load the document with the Modify open mode, I also get this message:
Number of deleted unreachable objects XX.

I just tried to add some text to the page and save it. Unfortunately, when I try that, I get an exception when opening the file: "Cannot handle iref streams. The current implementation of PDFsharp cannot handle this PDF feature introduced with Acrobat 6."
I then opened the file again with Acrobat and the Enfocus inspector tool, and I can see that the page still has the JPEG + ZIP filters.

Laurent
