PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

OutOfMemoryException with ARGB images
https://forum.pdfsharp.net/viewtopic.php?f=3&t=1955

Author:  SiliconMind [ Mon Mar 26, 2012 3:22 pm ]
Post subject:  OutOfMemoryException with ARGB images

I'm constantly hitting "out of memory" exceptions. They come from deep inside the PDFsharp library and usually look like this:
Code:
   in PdfSharp.Pdf.Advanced.PdfImage.ReadTrueColorMemoryBitmap(Int32 components, Int32 bits, Boolean hasAlpha)
   in PdfSharp.Pdf.Advanced.PdfImage.InitializeNonJpeg()
   in PdfSharp.Pdf.Advanced.PdfImage..ctor(PdfDocument document, XImage image)
   in PdfSharp.Pdf.Advanced.PdfImageTable.GetImage(XImage image)
   in PdfSharp.Drawing.XForm.GetImageName(XImage image)
   in PdfSharp.Drawing.Pdf.XGraphicsPdfRenderer.GetImageName(XImage image)
   in PdfSharp.Drawing.Pdf.XGraphicsPdfRenderer.Realize(XImage image)
   in PdfSharp.Drawing.Pdf.XGraphicsPdfRenderer.DrawImage(XImage image, Double x, Double y, Double width, Double height)
   in PdfSharp.Drawing.XGraphics.DrawImage(XImage image, Double x, Double y, Double width, Double height)


This happens when I create a PDF file with a number of large PNGs. The exception is usually thrown around the 25th to 30th PNG file. The PNG files are 32-bit, 4000x4000 pixels each.
I'm pretty sure that this is not a problem with the amount of available memory. The process that gets those exceptions uses no more than 800 MB (private working set) on a 64-bit machine with 4 GB of RAM. My PDF building code sometimes uses even more RAM (up to 2 GB), but these "out of memory" exceptions are thrown only when I'm using these large PNGs. I can add hundreds of JPEGs to a PDF and everything works fine. But when I use large PNGs, I get exceptions almost instantly.

The other thing that intrigues me is how slowly PNGs are processed. Why does adding 32-bit ARGB files take so much time?

Author:  Thomas Hoevel [ Tue Mar 27, 2012 10:10 am ]
Post subject:  Re: OutOfMemoryException with ARGB images

SiliconMind wrote:
Why does adding 32-bit ARGB files take so much time?
PDFsharp adds masks for images with transparency (ARGB includes the alpha channel that specifies the opacity of the pixels).

If you don't need transparency, then use RGB bitmaps for faster processing.

ReadTrueColorMemoryBitmap reads the image data from the operating system and creates the byte arrays that will be included in the PDF.
So this is the place where out of memory exceptions occur when you only add images to the PDF.

Author:  SiliconMind [ Tue Mar 27, 2012 10:55 am ]
Post subject:  Re: OutOfMemoryException with ARGB images

Thomas Hoevel wrote:
PDFsharp adds masks for images with transparency (ARGB includes the alpha channel that specifies the opacity of the pixels).

If you don't need transparency, then use RGB bitmaps for faster processing.

I need transparency, so I can't use JPEGs. I've noticed that you're using a MemoryBitmap to copy the image's byte data into an array and then you process that array. This is quite a noticeable overhead: at one point your code needs three times the amount of memory required for the original bitmap (one copy for the System.Drawing.Bitmap, a second for the MemoryStream to which you save the bitmap, and a third for the byte array you use for processing). For large images this is really a lot of memory (3 x 64 MB per image in my case). Why not just use Bitmap.LockBits?

Thomas Hoevel wrote:
ReadTrueColorMemoryBitmap reads the image data from the operating system and creates the byte arrays that will be included in the PDF.
So this is the place where out of memory exceptions occur when you only add images to the PDF.

Ok, but why the exception? There is clearly enough system memory available.

Author:  Thomas Hoevel [ Tue Mar 27, 2012 11:47 am ]
Post subject:  Re: OutOfMemoryException with ARGB images

SiliconMind wrote:
Why not just use Bitmap.LockBits?
Interesting idea, but currently we use BitmapImage internally and LockBits is not available. Must check how many follow-up changes this will require.

Author:  SiliconMind [ Tue Mar 27, 2012 12:19 pm ]
Post subject:  Re: OutOfMemoryException with ARGB images

Thomas Hoevel wrote:
SiliconMind wrote:
Why not just use Bitmap.LockBits?
Interesting idea, but currently we use BitmapImage internally and LockBits is not available. Must check how many follow-up changes this will require.

I think that using Bitmap.LockBits inside ReadTrueColorMemoryBitmap won't affect other code. Using LockBits would significantly reduce memory usage and speed up processing. There would be no need to create two additional copies of the image.

Anyway - I did some debugging. The exception is thrown when ReadTrueColorMemoryBitmap tries to allocate the imageBits byte array on line 362. But I have no idea why.

Author:  Thomas Hoevel [ Tue Mar 27, 2012 1:31 pm ]
Post subject:  Re: OutOfMemoryException with ARGB images

SiliconMind wrote:
I think that using Bitmap.LockBits inside ReadTrueColorMemoryBitmap won't affect other code.
I hope I didn't miss something: We cannot currently call Bitmap.LockBits inside ReadTrueColorMemoryBitmap because we don't have a Bitmap there. What we have is a BitmapSource, not a Bitmap.

To get a Bitmap instead of a BitmapSource, we'd have to change code outside this file. I don't know which consequences this may have.

Author:  SiliconMind [ Tue Mar 27, 2012 1:45 pm ]
Post subject:  Re: OutOfMemoryException with ARGB images

Thomas Hoevel wrote:
I hope I didn't miss something: We cannot currently call Bitmap.LockBits inside ReadTrueColorMemoryBitmap because we don't have a Bitmap there. What we have is a BitmapSource, not a Bitmap.

To get a Bitmap instead of a BitmapSource, we'd have to change code outside this file. I don't know which consequences this may have.


ReadTrueColorMemoryBitmap uses the XImage.gdiImage field to get the byte data for processing. XImage.gdiImage is just a System.Drawing.Image, so I think that it is possible to use LockBits. Or it's me who's missing something... which is possible, as I do not know the big picture of the whole PDFsharp library source.

Author:  Thomas Hoevel [ Tue Mar 27, 2012 2:58 pm ]
Post subject:  Re: OutOfMemoryException with ARGB images

The GDI+ build uses System.Drawing.Image which is the base class of System.Drawing.Bitmap.
LockBits is implemented by Bitmap, not Image.
The WPF build uses System.Windows.Media.Imaging.BitmapSource.

I falsely assumed you were using the WPF build. It might be worth trying the WPF build just in case this problem is caused by a limitation of GDI+ resources and not by a Large Object Heap (LOH) fragmentation problem.
I'd try the WPF build.
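
For the GDI+ build, the LockBits idea discussed in this thread might look roughly like the sketch below. It is only an illustration under stated assumptions: ReadArgbPixels and LockBitsSketch are hypothetical names, not PDFsharp code, and the sketch assumes the System.Drawing.Image held by XImage is in fact a System.Drawing.Bitmap at run time.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static class LockBitsSketch
{
    // Hypothetical helper (not part of PDFsharp): copies the pixels of a GDI+
    // image into a single managed array of 32-bit pixels, 4 bytes per pixel
    // in B, G, R, A order, without going through a MemoryStream.
    public static byte[] ReadArgbPixels(Image image)
    {
        // LockBits is declared on Bitmap, not Image, so a cast is required.
        Bitmap bitmap = image as Bitmap;
        if (bitmap == null)
            throw new ArgumentException("Image is not a System.Drawing.Bitmap.", "image");

        Rectangle rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        BitmapData data = bitmap.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        try
        {
            int rowBytes = bitmap.Width * 4;
            byte[] pixels = new byte[rowBytes * bitmap.Height];

            // Copy row by row because Stride may be larger than rowBytes.
            for (int y = 0; y < bitmap.Height; y++)
            {
                IntPtr row = new IntPtr(data.Scan0.ToInt64() + (long)y * data.Stride);
                Marshal.Copy(row, pixels, y * rowBytes, rowBytes);
            }
            return pixels;
        }
        finally
        {
            // Always release the lock, even if the copy fails.
            bitmap.UnlockBits(data);
        }
    }
}
```

With this approach only one additional managed array is allocated per image. Whether it fits into the existing PdfImage code without follow-up changes is exactly the open question raised above.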

Author:  SiliconMind [ Tue Mar 27, 2012 3:46 pm ]
Post subject:  Re: OutOfMemoryException with ARGB images

Thomas Hoevel wrote:
The GDI+ build uses System.Drawing.Image which is the base class of System.Drawing.Bitmap.
LockBits is implemented by Bitmap, not Image.
The WPF build uses System.Windows.Media.Imaging.BitmapSource.

I falsely assumed you were using the WPF build. It might be worth trying the WPF build just in case this problem is caused by a limitation of GDI+ resources and not by a Large Object Heap (LOH) fragmentation problem.
I'd try the WPF build.

Doh, you're right. I forgot about the WPF implementation. But even with the current WPF version, the issue with the additional byte array and MemoryStream remains.

A first, although very small, workaround for this issue is not to copy the MemoryStream into a byte array (lines 362-365) but to use MemoryStream.GetBuffer(). That way we have one less byte array to allocate. I've tested this and the OutOfMemoryException was not thrown until the 40th PNG file or so (earlier it was about 25 PNGs). So we've got an improvement.

The LockBits route is still possible. For the WPF variant you could create a temporary WriteableBitmap using the constructor WriteableBitmap(BitmapSource). Then use the WriteableBitmap.BackBuffer property, which returns a pointer to the bitmap contents - exactly like BitmapData.Scan0, which is returned by the call to the System.Drawing.Bitmap.LockBits() method.

Another advantage of this approach is that you don't need that ugly ReadWord stuff and all these hardcoded values for testing bitmap compatibility.
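
A rough sketch of the WPF variant described above, under stated assumptions: BackBufferSketch and ReadPixels are hypothetical names, not PDFsharp code, and the pixel layout of the result is whatever format the source bitmap has (check WriteableBitmap.Format before interpreting the bytes).

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Media.Imaging;

static class BackBufferSketch
{
    // Hypothetical helper (not part of PDFsharp): reads the raw pixel rows of
    // a BitmapSource through a temporary WriteableBitmap.
    public static byte[] ReadPixels(BitmapSource source)
    {
        // The constructor copies the source into a writeable back buffer.
        WriteableBitmap bitmap = new WriteableBitmap(source);

        bitmap.Lock();
        try
        {
            // BackBufferStride is the number of bytes per row, including padding.
            int size = bitmap.BackBufferStride * bitmap.PixelHeight;
            byte[] pixels = new byte[size];

            // BackBuffer plays the same role as BitmapData.Scan0 in GDI+.
            Marshal.Copy(bitmap.BackBuffer, pixels, 0, size);
            return pixels;
        }
        finally
        {
            bitmap.Unlock();
        }
    }
}
```

Note that the WriteableBitmap constructor itself allocates a copy of the image, so this avoids the MemoryStream and the header parsing rather than every extra copy.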

Author:  SiliconMind [ Wed Mar 28, 2012 1:29 pm ]
Post subject:  Re: OutOfMemoryException with ARGB images

If you need a copy of the MemoryStream data, you can just use MemoryStream.ToArray() instead of this:
Code:
imageBits = new byte[streamLength];
memory.Seek(0, SeekOrigin.Begin);
memory.Read(imageBits, 0, streamLength);


But remember that MemoryStream.ToArray() creates a copy of the byte array encapsulated by the MemoryStream. To reduce memory usage and improve performance you should use MemoryStream.GetBuffer().
Note, however, that MemoryStream.GetBuffer() returns the entire allocated buffer, including any unused space past the end of the data. So you should never use imageBits.Length to find the actual data length inside the array. Instead, use a helper variable:
Code:
int streamLength = (int)memory.Length; // Length is a long, so a cast is needed
byte[] imageBits = memory.GetBuffer();
memory.Close();


As a matter of fact, neither MemoryStream.Close() nor MemoryStream.Dispose() releases the memory used by the MemoryStream. So thanks to MemoryStream.GetBuffer() you can reuse the byte array and avoid creating a new one.
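
The difference between ToArray() and GetBuffer() can be checked with a small console program. This is a minimal sketch, not PDFsharp code; GetBufferDemo and the chunk sizes are made up for illustration.

```csharp
using System;
using System.IO;

static class GetBufferDemo
{
    static void Main()
    {
        using (MemoryStream memory = new MemoryStream())
        {
            // Write in two chunks so the internal buffer has to grow at least once.
            byte[] chunk = new byte[1000];
            memory.Write(chunk, 0, chunk.Length);
            memory.Write(chunk, 0, 10);

            int streamLength = (int)memory.Length;  // actual number of data bytes
            byte[] copy = memory.ToArray();         // new array, exactly streamLength bytes
            byte[] buffer = memory.GetBuffer();     // internal array, no copy is made

            Console.WriteLine("Stream length:      " + streamLength);
            Console.WriteLine("ToArray().Length:   " + copy.Length);

            // The buffer is at least streamLength bytes long and may be longer,
            // which is why streamLength must be kept alongside it.
            Console.WriteLine("GetBuffer().Length: " + buffer.Length);

            // GetBuffer() hands out the same array each time until the stream grows.
            Console.WriteLine("Same array: " + object.ReferenceEquals(buffer, memory.GetBuffer()));
        }
    }
}
```

GetBuffer() only works for streams whose buffer is publicly visible, which is the case for a MemoryStream created with the default constructor as here.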

I've attached a modified PdfImage.cs file with the appropriate changes. I didn't have time to try the WriteableBitmap.BackBuffer / Bitmap.LockBits solution, but I still believe that it is the right way to go. You should consider that.

I've also noticed that inside the FlateDecode.Encode() method you do something like this:
Code:
ms.Capacity = (int)ms.Length; // reallocates the internal buffer to the exact data size
return ms.GetBuffer();

By changing the memory stream's capacity you create another copy of the byte array used by that memory stream. It's another easy way to hit the LOH issue. You could return the whole buffer and then write only the actual data to the PDF, without the padding bytes. You would have to store the real data length, or use a reverse loop before writing the bytes to a file to find the actual data length inside the buffer array.
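
One way to return the whole buffer together with the real data length is an ArraySegment<byte>. This is only a sketch of the idea: BufferSegmentSketch, GetData and WriteData are hypothetical names and do not exist in PDFsharp.

```csharp
using System;
using System.IO;

static class BufferSegmentSketch
{
    // Hypothetical replacement for "trim Capacity, then GetBuffer()": wraps the
    // internal buffer together with the real data length. No bytes are copied.
    public static ArraySegment<byte> GetData(MemoryStream ms)
    {
        return new ArraySegment<byte>(ms.GetBuffer(), 0, (int)ms.Length);
    }

    // Writes only the real data to the target, skipping the unused tail of the buffer.
    public static void WriteData(Stream target, ArraySegment<byte> data)
    {
        target.Write(data.Array, data.Offset, data.Count);
    }

    static void Main()
    {
        MemoryStream ms = new MemoryStream();
        ms.Write(new byte[300], 0, 300);
        ms.WriteByte(1); // forces the internal buffer to grow beyond the data

        ArraySegment<byte> data = GetData(ms);
        using (MemoryStream output = new MemoryStream())
        {
            WriteData(output, data);
            Console.WriteLine("Buffer size:   " + data.Array.Length);
            Console.WriteLine("Bytes written: " + output.Length);
        }
    }
}
```

Storing the length this way avoids the reverse loop, which could not tell padding apart from data that legitimately ends in zero bytes.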

Attachments:
File comment: Large Object Heap fragmentation fix
PdfImage.zip [10.29 KiB]

All times are UTC