PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Wed Jul 24, 2024 9:58 am

Environment:
- .NET 8
- PdfSharp 6.1.1

XImage.FromStream cannot work with streams returned from HttpClient and always throws an InvalidOperationException with the message "Unsupported image format".

Repro code: see attachment code.png.


The reason for this behavior is that the XImage class checks whether the stream is a PDF file and accesses the Position property without first checking stream.CanSeek (see attachment exc.png).


The workaround is to copy the stream into a MemoryStream or a file first (see the second test).
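
For readers without the attachment, the workaround can be sketched roughly as follows. This is a minimal sketch, not the code from the repro project: the class SeekableStreams and the method EnsureSeekableAsync are hypothetical names introduced here, and only System.IO types are used.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class SeekableStreams
{
    // Hypothetical helper (not part of PDFsharp): returns the input unchanged
    // if it can already seek, otherwise buffers it into a MemoryStream.
    public static async Task<Stream> EnsureSeekableAsync(Stream source)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (source.CanSeek) return source;

        var buffer = new MemoryStream();
        await source.CopyToAsync(buffer);
        buffer.Position = 0; // rewind so the consumer reads from the start
        return buffer;
    }
}
```

The returned stream could then be passed to XImage.FromStream in place of the stream obtained from HttpClient. Note that this buffers the whole download in memory, so it is only reasonable for images of moderate size.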

Some personal opinion:
Using XImage.FromStream to import PDF files into a PDF library is ... suspicious at least.
However, if that's really a requirement and there's no way to handle it without seeking, a better exception would be nice for developers, something like 'Cannot import images from non-seekable streams'.


Attachments:
File comment: Repro project
PdfSharpTests.zip [929 Bytes]
Posted: Wed Jul 24, 2024 1:03 pm
PDFsharp Expert

aithma wrote:
Some personal opinion:
Using XImage.FromStream to import PDF files into a PDF library is .. suspicious at least.
I'm afraid I don't get your point.
The PDF library implements the XImage class and the FromStream method. It is typically used for images from resource streams or for images stored in databases as byte arrays.
I do not see anything suspicious here.

I admit the error message should be better in this case. However, in our applications we do not load images via HttpClient, so this case never occurred.

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)


Posted: Thu Jul 25, 2024 4:16 am

To me as a developer, XImage.FromStream looks like it imports images. I would never think of passing a PDF stream into it.
It's not even documented: https://docs.pdfsharp.net/PDFsharp/Topi ... awing.html

Therefore, I would suggest providing another method XImage.FromPdfStream. It's more obvious what's happening, it works around the issue of seeking in streams and since most people will use XImage.FromStream for "real" image loading, it will improve their performance (CPU, GC memory pressure).

If you cannot break this interface, the next developer would be very happy if you just added another check: if (!stream.CanSeek) throw new InvalidOperationException("Can't import non-seekable streams");
It took me a while to figure out that the current exception "Unsupported image format" is lying and there's no problem with my image.
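
Spelled out, the kind of check I mean could look like the sketch below. This is only an illustration, not PDFsharp's actual code: StreamGuards and ThrowIfNotSeekable are hypothetical names, and where such a check would sit inside XImage is an assumption on my part.

```csharp
using System;
using System.IO;

public static class StreamGuards
{
    // Hypothetical guard (not PDFsharp's actual code): fail fast with a precise
    // message instead of a misleading "Unsupported image format".
    public static void ThrowIfNotSeekable(Stream stream)
    {
        if (stream is null) throw new ArgumentNullException(nameof(stream));
        if (!stream.CanSeek)
            throw new InvalidOperationException("Can't import non-seekable streams");
    }
}
```

Called before the first access to Position or Seek, a guard like this would report the real cause directly.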


Posted: Thu Jul 25, 2024 5:17 am
PDFsharp Expert

aithma wrote:
Therefore, I would suggest providing another method XImage.FromPdfStream. It's more obvious what's happening, it works around the issue of seeking in streams and since most people will use XImage.FromStream for "real" image loading, it will improve their performance (CPU, GC memory pressure).
What is a "PdfStream"?

I think I understand what you are after. But loading "real" images currently also requires a seekable stream - which will be created automatically by the Core build, so not much performance to be gained.

_________________
Öhmesh Volta ("() => true")
PDFsharp Team Holiday Substitute


Posted: Thu Jul 25, 2024 8:01 am

Sorry. PdfStream = System.IO.Stream that contains data for a PDF file.

Okay, I wasn't aware that importing "real" :D images also needs a seekable stream. We were using PdfSharpCore before and upgraded to the latest PdfSharp, and "they" didn't have the requirement for seekable streams. I'm not saying PdfSharp should support non-seekable streams, btw. Just clarifying why I expected that to work.

So I guess the "more specific exception" might be the way to go here.


Posted: Thu Jul 25, 2024 8:49 am
PDFsharp Expert

aithma wrote:
We were using PdfSharpCore before and upgraded to PdfSharp latest and "they" didn't have the requirement for seekable streams.
PdfSharpCore is a partial port of PDFsharp, and "they" use an external library for image handling. "They" expect you to specify an Action where PDFsharp takes a Stream as a parameter.

"Their" XImage may decrease the quality of the images you load or may increase the file size of the images you load, but hey, at least they do not require seekable streams. I think this behaviour is suspicious.

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)

