PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

PostPosted: Sat Jul 31, 2010 5:16 pm

Joined: Sat Jul 31, 2010 2:20 pm
Posts: 4
I am having a problem creating a PDF that contains a JPG image, but only on Windows 7. The C# program works fine under XP and Vista. On Windows 7 the result is a barely recognizable image. I wrote a short program to demonstrate the problem; it simply draws a JPG image on a PDF document page.

The program was created in the following environment:
32 bit Vista Home with .net framework 3.5 using Visual C# 2008 express edition.
The source jpg is 59k. The resultant pdf is 62.6k and looks OK.

The Windows 7 environment is:
64 bit Windows 7 Professional with .net framework 4.0.
When the same program is run with the same jpg, the resultant pdf is 211k and looks terrible.

Also relevant is:
Multiple jpg files fail.
All the failing jpg files had been resized, converted to greyscale and brightness/contrast tweaked (with IrfanView).
The original jpg works fine in XP, Vista AND Windows 7.
Multiple pdf viewers had the same results - ok under XP or Vista, bad under Windows 7.

Source code:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.IO;
using PdfSharp;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
//using PdfSharp.Pdf.IO;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            if (openFileDialog1.ShowDialog() == DialogResult.OK)
            {
                Image jpgImage;
                jpgImage = Image.FromFile(openFileDialog1.FileName);

                // Create a new PDF document
                PdfDocument document = new PdfDocument();

                // Create an empty page
                PdfPage page = document.AddPage();
                page.Orientation = PageOrientation.Landscape; //both landscape and portrait fail

                // Get an XGraphics object for drawing
                XGraphics gfx = XGraphics.FromPdfPage(page);

                // Draw jpg
                gfx.DrawImage(jpgImage, 0, 0, page.Width, page.Height);

                // Save the document...
                const string filename = "c:\\temp\\PDF image test.pdf";
                document.Save(filename);
            }
        }
    }
}

I have included the source jpg file, the pdf file created under Vista and the pdf file created under Windows 7.

Note- I had error messages when trying to upload the files...


Attachments:
401429.jpg [ 59.09 KiB ]

PostPosted: Sun Aug 08, 2010 8:40 am
PDFsharp Expert

Joined: Wed Dec 09, 2009 8:59 am
Posts: 339
PeteW wrote:
Note- I had error messages when trying to upload the files...

You can upload ZIP files containing PDF.

I'll have a look after our holidays and after switching my computer to Windows 7.
Sounds like a Microsoft compatibility problem (PDFsharp relies on the framework/OS to decode JPEGs), but maybe we can fix it or find a workaround.

_________________
Öhmesh Volta ("() => true")
PDFsharp Team Holiday Substitute


PostPosted: Sat Feb 19, 2011 1:05 pm

Joined: Sat Feb 19, 2011 1:02 pm
Posts: 1
Hi,

I'm having the same issue as described here.

Windows 7, 64bit

PDFsharp standard build all, don't know if it's GDI+ or WPF.

Have you had any luck with solving this problem?

/Timeruler


PostPosted: Mon Mar 21, 2011 5:00 pm

Joined: Sat Jul 31, 2010 2:20 pm
Posts: 4
I have done further problem determination and the results are:
1. Any JPG that has been edited using the Convert to Grayscale function in IrfanView (an excellent JPG viewer/editor) has the problem when used with the PDFsharp DrawImage function. Editing with other programs such as Photoshop, Picasa, etc. is OK.
2. Failing JPGs display OK in Word, Picasa, Photoshop and all other viewers I've tried. PDFsharp is the only problem.
3. Within C#, a failing JPG displays OK with DrawImage to a PictureBox object. It also works OK with DrawImage to a PrintDocument object, including PrintPreview. Does PDFsharp use that same DrawImage function in .NET?

Summary
1. The IrfanView Convert to Grayscale option is what creates the incompatible JPG.
2. Only PDFsharp has a problem with this JPG, and only in Windows 7 (XP & Vista are OK).


PostPosted: Tue Mar 22, 2011 8:26 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
PeteW wrote:
2. Only PDFsharp has a problem with this jpg, and only in Windows 7 (XP & Vista are ok).

PDFsharp uses framework functions to open the images. So it shouldn't be a problem of only PDFsharp.

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Tue Mar 22, 2011 9:16 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
The GDI+ build uses the following line to get the JPEG "bytes" for an image:
Code:
image.gdiImage.Save(memory, ImageFormat.Jpeg);

Under XP this returns 60 kB - the original file.
Under Windows 7 this returns about 240 kB - the distorted image you see in the PDF file, not the original image.

The WPF build doesn't work either (but I haven't stepped through the code yet).
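
For anyone who wants to reproduce the size difference outside PDFsharp, here is a minimal sketch that assumes a plain System.Drawing (GDI+) project. The class and method names (`ResaveCheck`, `ResavedJpegLength`) are made up for illustration and are not part of PDFsharp.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class ResaveCheck
{
    // Returns the number of bytes GDI+ produces when the image is saved as JPEG.
    // Hypothetical helper for illustration; it mirrors the Save call quoted above.
    public static long ResavedJpegLength(Image image)
    {
        using (var memory = new MemoryStream())
        {
            image.Save(memory, ImageFormat.Jpeg);
            return memory.Length;
        }
    }
}
```

Load the file with `Image.FromFile(path)` and compare the result with `new FileInfo(path).Length`. If the two numbers differ, Save did not hand back the original file.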

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Sat Mar 26, 2011 1:46 pm

Joined: Sat Jul 31, 2010 2:20 pm
Posts: 4
Thanks for the update. I didn't mean only PDFsharp had a problem. The point I was making is that other external programs and similar functions within C# interpret the image correctly. I am not saying that PDFsharp is the cause of the failure - it may be IrfanView or Framework or C#. However, I think the correction or circumvention would have to be within PDFsharp.

The circumvention is to use an editing program other than IrfanView. I have had no problems when using the similar, but not identical, convert to black/white in Picasa.


PostPosted: Mon Mar 28, 2011 7:53 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
PeteW wrote:
However, I think the correction or circumvention would have to be within PDFsharp.

The problem is part of the Windows OS, so it's Microsoft's job to fix it.
We can only try to change PDFsharp to work around that problem (which is the second-best option).

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Mon Apr 18, 2011 7:11 pm

Joined: Mon Apr 18, 2011 4:50 pm
Posts: 2
I think the problem is to be found in the method:

PdfSharp.Pdf.Advanced.PdfImage.cs InitializeJpeg()

In the code the graphic that was previously loaded in memory is being saved to a memory stream object:

Code:
memory = new MemoryStream();
image.gdiImage.Save(memory, ImageFormat.Jpeg);


This Save call is actually converting the graphic to a color image. I checked this by adding another save line right after the original one:

Code:
image.gdiImage.Save(@"c:\temp\fromDotNet.jpg");


When I re-ran the application with this new graphic (instead of the original one), the graphic's properties were different. For the original image, the code sets the ColorSpace to /DeviceGray:
Code:
Elements[Keys.ColorSpace] = new PdfName("/DeviceGray");


Running the code on the re-saved image takes the /DeviceRGB branch instead:
Code:
Elements[Keys.ColorSpace] = new PdfName("/DeviceRGB");


I stepped through the application, and manually called the code to record the ColorSpace as "/DeviceRGB" instead of "/DeviceGray" and the output document rendered correctly.
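
One way to avoid this kind of mismatch is to read the number of color components from the JPEG bytes that actually get embedded, instead of trusting the properties of the loaded image. The following is only a sketch of that idea, not PDFsharp code: `JpegHeader`, `ReadComponentCount` and `ColorSpaceFor` are hypothetical names, and the mapping from component count to color space is a simplification (it ignores YCCK and Adobe markers).

```csharp
using System;
using System.IO;

static class JpegHeader
{
    // Returns the component count from the first SOF (start of frame) segment:
    // 1 = grayscale, 3 = YCbCr/RGB, 4 = CMYK/YCCK. Returns -1 if none is found.
    public static int ReadComponentCount(Stream stream)
    {
        if (stream.ReadByte() != 0xFF || stream.ReadByte() != 0xD8)
            return -1; // no SOI marker, so not a JPEG stream

        while (true)
        {
            if (stream.ReadByte() != 0xFF) return -1;          // expected a marker prefix
            int marker = stream.ReadByte();
            while (marker == 0xFF) marker = stream.ReadByte(); // skip fill bytes
            if (marker < 0 || marker == 0xD9 || marker == 0xDA) return -1; // EOF, EOI or SOS
            if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7)) continue; // no payload

            int hi = stream.ReadByte(), lo = stream.ReadByte();
            if (hi < 0 || lo < 0) return -1;
            int length = (hi << 8) | lo; // segment length includes these two bytes
            if (length < 2) return -1;

            bool isSof = marker >= 0xC0 && marker <= 0xCF
                         && marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
            if (isSof)
            {
                // SOF payload: precision (1), height (2), width (2), components (1)
                int components = -1;
                for (int i = 0; i < 6; i++)
                {
                    components = stream.ReadByte();
                    if (components < 0) return -1;
                }
                return components;
            }

            // Skip the payload of any other segment.
            for (int remaining = length - 2; remaining > 0; remaining--)
                if (stream.ReadByte() < 0) return -1;
        }
    }

    // Simplified mapping to a PDF color space name; returns null if unknown.
    public static string ColorSpaceFor(int components)
    {
        switch (components)
        {
            case 1: return "/DeviceGray";
            case 3: return "/DeviceRGB";
            case 4: return "/DeviceCMYK";
            default: return null;
        }
    }
}
```

Run against the memory stream filled by Save, this would report 3 components for the re-saved image in this thread, if the reloaded Format24bppRgb pixel format reflects the saved bytes, which matches the observation that /DeviceRGB renders correctly.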

So I see various possible fixes for this issue:
    1) Add an encoder for the graphic that matches the original settings. - I have no idea what would be required to do this.

    2) Use the memory stream's saved settings - I suspect that this would require reloading the graphic, unless a specific encoder was used. I don't have enough experience with GDI to know if saving the file without any settings always results in the default encoder settings. If that is the case, that would explain why the BitsPerComponent is a static 8 instead of a variable 8, 16, 24, or 32. It may be sufficient to merely set this to "/DeviceRGB".

    3) Save off a copy of the original file at the point it is loaded and do not re-save it. - I suspect that this may not be the answer for two reasons. First, it effectively doubles the amount of memory required for a graphic. Second, a graphic may well not be loaded from disk; in that situation there would be no data available, and the original problem returns.

    4) Use the InitializeNonJpeg() or a variation. - I did not evaluate this code, as the JPG code appears sufficient except for mismatched properties.

By way of additional verification of this being the root cause, checking the values of the supplied graphic:
.PixelFormat = PixelFormat.Format8bppIndexed

Reloading the image (from the saved memory stream) gives this value:
.PixelFormat = PixelFormat.Format24bppRgb

So there definitely is a difference.
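
The check described above can be repeated with a few lines of System.Drawing code. This is a sketch only; `PixelFormatRoundTrip` and `AfterJpegResave` are hypothetical names, not PDFsharp members.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class PixelFormatRoundTrip
{
    // Saves the image as JPEG into memory, reloads it from there,
    // and returns the pixel format of the reloaded copy.
    public static PixelFormat AfterJpegResave(Image image)
    {
        using (var memory = new MemoryStream())
        {
            image.Save(memory, ImageFormat.Jpeg);
            memory.Position = 0; // rewind before reading the stream back
            using (Image reloaded = Image.FromStream(memory))
            {
                return reloaded.PixelFormat;
            }
        }
    }
}
```

Comparing `image.PixelFormat` with `PixelFormatRoundTrip.AfterJpegResave(image)` shows whether the save step changed the format of a given file on a given OS.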

This may also explain why the "/Decode" for the ColorSpaceYcck needed to be inverted. I don't have any images of this type, but it may be worth rechecking if using another version.

I'm attaching a copy of the PdfImage.cs with option #2 - using a copy of the image from the saved memory stream. I haven't changed any of the elements to use the new "renderedImage" image object except those that cause the item to choose a different ColorSpace. Code is designated as "// NEW"

Unfortunately, I am working in VS 2005 so I cannot determine how to adjust the WPF code, if any change is needed.


Attachments:
File comment: An updated instance of the PdfImage.cs image class that reloads the image. This produces a correct image output for the example image provided in this thread.
PdfImage.zip [9.95 KiB]

PostPosted: Tue Apr 19, 2011 7:59 am
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
seraphire wrote:
I think the problem is to be found in the method:

Thank you very much for investigating that problem.

Under XP, Save gives us the original JPEG file.
Under Windows 7, Save gives us a file that is much bigger than the original file. If I understand your solution correctly, you found a way to display this much bigger file correctly, while with the original PDFsharp code it doesn't show correctly.
But the big increase in file size remains.

Under XP it's so easy: we call Load, we call Save - and we get the file that was loaded.
Under Windows 7 this works for most files, but not for greyscale images created with IrfanView.

Maybe I can find a simple fix for the WPF build. Thanks again for your feedback.

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Tue Apr 19, 2011 5:37 pm

Joined: Mon Apr 18, 2011 4:50 pm
Posts: 2
I was looking into using an alternate form of the Save() method that would take encoding parameters to attempt to induce the Grayscale again. There appears to be an OS level bug in the EncoderParameters when attempting to retrieve them for instances such as JPEG.

(See: http://stackoverflow.com/questions/3152506/unexpected-bitmap-region-is-already-locked-exception-with-getencoderparameterli)

It may be possible to set them all manually, but I don't know enough about JPEG processing. But I did notice, as you pointed out, that the size of the file grew substantially.


PostPosted: Thu Apr 25, 2013 12:04 pm

Joined: Thu Feb 17, 2011 12:27 pm
Posts: 11
Thomas Hoevel wrote:
The GDI+ build uses the following line to get the JPEG "bytes" for an image:
Code:
image.gdiImage.Save(memory, ImageFormat.Jpeg);

Under XP this returns 60 kB - the original file.
Under Windows 7 this returns about 240 kB - the distorted image you see in the PDF file, not the original image.

The WPF build doesn't work either (but I haven't stepped through the code yet).


Please correct me if I'm wrong, but the easiest thing that can be done is to just copy original Jpeg contents (byte by byte) into the PDF.
So instead of this:
Code:
image.gdiImage.Save(memory, ImageFormat.Jpeg);

Do this:
Code:
// A path that does not start with "*" is treated as a real file on disk.
if (!image.path.StartsWith("*"))
{
    // Copy the original JPEG file byte by byte instead of re-encoding it.
    using (FileStream sourceFile = File.OpenRead(image.path))
    {
        int count = 0;
        byte[] buffer = new byte[1024];
        memory = new MemoryStream((int)sourceFile.Length);
        do
        {
            count = sourceFile.Read(buffer, 0, buffer.Length);
            // memory.Write(buffer, 0, buffer.Length);
            memory.Write(buffer, 0, count);
        }
        while (count > 0);
    }
}
else
{
    // No file available: fall back to the original re-encoding method.
    memory = new MemoryStream();
    image.gdiImage.Save(memory, ImageFormat.Jpeg);
}


This has some positive side effects - it preserves original file structure. Image.Save(stream, ImageFormat.Jpeg) causes re-compression and uses default compression settings. It may introduce compression artefacts and in case of well optimized Jpegs may result in larger file size when compared to original Jpeg saved on disk.

Update by Forum Moderator: Instead of "memory.Write(buffer, 0, buffer.Length)" use "memory.Write(buffer, 0, count)". Change is included in the code snippet above.


PostPosted: Thu Apr 25, 2013 12:47 pm
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
Hi!
SiliconMind wrote:
Please correct me if I'm wrong, but the easiest thing that can be done is to just copy original Jpeg contents (byte by byte) into the PDF.
Not all images have an original file. What to do with images from memory streams, resources, &c.?

Believe me: with Windows XP our method gets the original file if the file came from JPEG, no re-compression, no quality loss, no complication.
Everything worked fine until MS messed it up with Windows 7 - now it works fine except for some grayscale JPEG files.

It could be an option to check path first and use "our" method as a fallback strategy.
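
Such a fallback strategy could look roughly like the sketch below. It is not the PDFsharp implementation: `JpegBytes`, `LooksLikeJpeg` and `Get` are hypothetical names, and using `File.Exists` to decide whether a path is usable is an assumption made for this example.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class JpegBytes
{
    // True if the data starts with the JPEG SOI marker (FF D8).
    public static bool LooksLikeJpeg(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0xFF && data[1] == 0xD8;
    }

    // Prefers the untouched bytes of the original file. Falls back to
    // re-encoding through GDI+ when there is no usable file.
    public static byte[] Get(string path, Image image)
    {
        if (!string.IsNullOrEmpty(path) && File.Exists(path))
        {
            byte[] original = File.ReadAllBytes(path);
            if (LooksLikeJpeg(original))
                return original;
        }

        using (var memory = new MemoryStream())
        {
            image.Save(memory, ImageFormat.Jpeg);
            return memory.ToArray();
        }
    }
}
```

Images that come from resources or memory streams still go through Save, so the grayscale problem on Windows 7 would remain for them.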

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Thu Apr 25, 2013 1:20 pm

Joined: Sat Jul 31, 2010 2:20 pm
Posts: 4
Thanks for the update and to all who investigated the problem. The original application stored the image(s) in a resource - no JPG file. Only the 'test' program used a file. Has anyone thought to try the program and circumventions with Win8?


PostPosted: Fri Apr 26, 2013 7:41 am

Joined: Thu Feb 17, 2011 12:27 pm
Posts: 11
Thomas Hoevel wrote:
Not all images have an original file. What to do with images from memory streams, resources, &c.?

As you can see in my snippet, for in memory images I use the original method.

As far as I remember (but I'm not 100% sure), images created in memory will not be processed by InitializeJpeg, so the original issue does not affect them anyway.

The only remaining issue is with images read from streams, and the only reasonable fix I can think of is to save the stream to a file before creating the XImage. But that should be done outside of the PdfSharp lib.
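
A minimal sketch of that workaround on the application side, assuming .NET 4 or later for `Stream.CopyTo`. `TempJpeg` and `SaveToTempFile` are hypothetical names for this example.

```csharp
using System.IO;

static class TempJpeg
{
    // Copies the stream to a new temporary file and returns the file's path.
    // The caller is responsible for deleting the file when it is no longer needed.
    public static string SaveToTempFile(Stream source)
    {
        string path = Path.GetTempFileName();
        using (FileStream target = File.Create(path))
        {
            source.CopyTo(target);
        }
        return path;
    }
}
```

The returned path can then be passed to `XImage.FromFile`. It is probably safest to delete the temporary file only after the document has been saved, since the file may still be read at that point.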

Thomas Hoevel wrote:
Believe me: with Windows XP our method gets the original file if the file came from JPEG, no re-compression, no quality loss, no complication.

But this does not work correctly on other systems and I wouldn't expect a fix from MS any time soon. So a fix on our side is needed. I've run some tests and when I used file copy instead of Image.Save() I got different PDF file sizes. Direct file copy created files about 10% larger than the original Image.Save() method. It seems that Image.Save() introduces re-compression after all. I didn't have time to test the same code on XP, but let's face it - XP is slowly being phased out, and since there is no fix from MS...

