PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Tue Sep 10, 2019 4:07 pm

Joined: Tue Sep 10, 2019 3:42 pm
Posts: 3
Hi there,

I want to convert a byte array containing a Pdf with one page to a PdfDocument using the following code:

Code:
PdfDocument document;
using (MemoryStream ms = new MemoryStream(pdfBuf))
{
    document = PdfReader.Open(ms, PdfDocumentOpenMode.ReadOnly);
    document.Close();
}

document.Save(@"C:\Temp\OutputFile.pdf"); // Fails

However, this fails with the error message "Cannot save a PDF document with no pages". But if I add one line to the code it works:

Code:
PdfDocument document;
using (MemoryStream ms = new MemoryStream(pdfBuf))
{
    document = PdfReader.Open(ms, PdfDocumentOpenMode.ReadOnly);
    int pages = document.PageCount; // <- One extra line to make this work
    document.Close();
}

document.Save(@"C:\Temp\OutputFile.pdf"); // Works now

It seems that reading PageCount updates PDFsharp's internal state...

Kind regards,
Aleks


Posted: Tue Sep 10, 2019 9:31 pm
PDFsharp Expert

Joined: Wed Dec 09, 2009 8:59 am
Posts: 339
Hi!

Thanks for the feedback.

Does this occur with the stable version of PDFsharp 1.50 or 1.51?
See also:
http://forum.pdfsharp.net/viewtopic.php?f=2&t=832

Re your code:
Why don't you simply write the PDF you have in the buffer into a file? No need to create a temporary PdfDocument.
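
If the bytes only need to end up on disk, that suggestion could look roughly like the sketch below. It assumes pdfBuf already holds a complete PDF; the class and method names (PdfBufferDump, WriteBuffer) are made up for illustration and are not part of PDFsharp.

```csharp
using System;
using System.IO;

static class PdfBufferDump
{
    // Hypothetical helper: writes the raw bytes unchanged.
    // No PdfDocument is created, so the page check in Save is never reached.
    public static void WriteBuffer(byte[] pdfBuf, string path)
    {
        if (pdfBuf == null) throw new ArgumentNullException(nameof(pdfBuf));
        File.WriteAllBytes(path, pdfBuf);
    }
}
```

This only helps when the document is not going to be inspected or changed; it does nothing to validate that the buffer is a readable PDF.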

_________________
Öhmesh Volta ("() => true")
PDFsharp Team Holiday Substitute


Posted: Wed Sep 11, 2019 1:06 pm

Joined: Tue Sep 10, 2019 3:42 pm
Posts: 3
Hi there,

This occurs in both v1.50.5147 (stable) and v1.51.5185 (beta).

Regarding my code: This is just a simple proof and not the real code. I only became aware of the issue because I saved the document at this point for debugging ;)

Kind regards,
Aleks


Posted: Wed Sep 11, 2019 3:09 pm
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
aaalexxx wrote:
Regarding my code: This is just a simple proof and not the real code.
It is just a snippet.
Does this happen for any PDF file with just one page or only for some PDF files with just one page?

_________________
Regards
Thomas Hoevel
PDFsharp Team


Posted: Wed Sep 11, 2019 3:50 pm

Joined: Tue Sep 10, 2019 3:42 pm
Posts: 3
Hi,

I just ran the code against a few hundred PDF files, and as far as I can tell it happens with any document, regardless of its number of pages...

Kind regards,
Aleks


Posted: Mon Mar 16, 2020 4:26 pm

Joined: Mon Mar 16, 2020 4:23 pm
Posts: 6
I can confirm this is still an issue with 1.50.5147.


Posted: Mon Jun 29, 2020 7:00 pm

Joined: Mon Jun 29, 2020 6:09 pm
Posts: 1
This is because of line 373 in PdfDocument.cs.

Code:
        void DoSave(PdfWriter writer)
        {
            if (_pages == null || _pages.Count == 0)


The _pages object is not instantiated unless the Pages or PageCount property is accessed first, or the PDF is opened with the Modify flag (this explicit test is performed in PdfReader.cs#L472, where _pages is created if the flag is set). Nothing stops a caller from using Pages.Count, but I noticed that the PageCount property only calls the Pages property internally when CanModify evaluates to true; when CanModify evaluates to false, the page count is obtained by alternate means. This is a bit of a moot point, though, since the CanModify property is hard-coded to return true.

The only ways to avoid this error are to open the PDF stream as Modify, or access the Pages/PageCount properties at some point before saving. Perhaps it was assumed a save would never be performed unless Modify were indicated? I can think of many scenarios where a save is required without an intent of modification.
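
A rough sketch of those two workarounds follows. It relies on the behaviour reported in this thread for 1.50.5147 and 1.51.5185 and may not apply to other versions; the class and method names (ReadOnlySaveWorkaround, SaveCopy) are placeholders of mine, not PDFsharp API.

```csharp
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

static class ReadOnlySaveWorkaround
{
    // Hypothetical helper: loads a PDF from a buffer and saves it to a file.
    public static void SaveCopy(byte[] pdfBuf, string path)
    {
        using (var ms = new MemoryStream(pdfBuf))
        {
            // Workaround 1: open with Modify, so the reader creates _pages itself.
            // PdfDocument document = PdfReader.Open(ms, PdfDocumentOpenMode.Modify);

            // Workaround 2: stay in ReadOnly mode, but read PageCount (or Pages)
            // once so that _pages is populated before Save runs its check.
            PdfDocument document = PdfReader.Open(ms, PdfDocumentOpenMode.ReadOnly);
            int pages = document.PageCount;

            document.Save(path);
        }
    }
}
```

Reading PageCount is the smaller change if the rest of the code depends on the ReadOnly mode; switching to Modify states the intent more openly.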

So for cases where the Modify flag is not used: should _pages be created within a constructor, or should the condition check Pages instead of _pages?
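
To show the difference between the two checks, here is a toy model of a lazily created page list. It is not PDFsharp code: LazyDoc and its members are invented for illustration and only mimic the pattern described above.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a document class with a lazily created page list.
class LazyDoc
{
    private List<string> _pages;        // stays null until Pages is first read
    private readonly string[] _stored;  // what the "file" actually contains

    public LazyDoc(params string[] stored) { _stored = stored; }

    // Creates the list on first access, then returns the same instance.
    public List<string> Pages => _pages ?? (_pages = new List<string>(_stored));

    // Mirrors the check quoted above: looks at the backing field directly.
    public bool CanSaveFieldCheck() => !(_pages == null || _pages.Count == 0);

    // Alternative raised in this post: go through the property instead.
    public bool CanSavePropertyCheck() => Pages.Count != 0;
}
```

With `var doc = new LazyDoc("page 1");`, the field check returns false on a fresh instance even though a page exists, while the property check returns true and, as a side effect, makes later field checks succeed as well.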

I know of at least one monitor that has suffered an untimely death as a result of this defect.

***EDIT***
I have submitted an issue and PR to correct it. https://github.com/empira/PDFsharp/issues/130

