PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


PostPosted: Wed Jun 15, 2016 1:03 pm 
Joined: Wed Jun 15, 2016 12:50 pm
Posts: 3
Hi, I have just started using PDFsharp to create a little tool here in the office. It's an application written in C# that runs from within Autodesk Revit; it splits a PDF file and then renames each resulting file to match a drawing number, project number and revision. I got the splitting and naming to work (wow, thanks, it saves a ton of work), but I do not understand why each of the split PDFs is almost as large as the original file. How is this possible, and how can I prevent it? They are vector-printed drawings, although some are raster images (which are actually smaller than the vector ones).

The code I use to create the split is as follows:


Code:
for (int idx = 0; idx < inputDocument.PageCount; idx++)
{
    // Create a new single-page document for this page
    PdfDocument outputDocument = new PdfDocument();
    outputDocument.Version = inputDocument.Version;
    outputDocument.Info.Title = String.Format("Page {0} of {1}", idx + 1, inputDocument.Info.Title);
    outputDocument.Info.Creator = inputDocument.Info.Creator;

    // Add the page (inputDocument must have been opened with PdfDocumentOpenMode.Import)
    outputDocument.AddPage(inputDocument.Pages[idx]);

    // Get name through Revit API
    FileName = GetFileName();

    // Build the path once so the existence check and Save() refer to the same file
    string outputPath = Path.Combine(directory, FileName + ".pdf");

    if (File.Exists(outputPath))
    {
        // Remember names that are about to be overwritten
        OverwrittenNamesString.Add(FileName);
    }

    outputDocument.Save(outputPath);

    iCount++;
}



See the screenshot showing the original file (55 Moorgate - Demolition pdf); D004 to D006 are raster images and the others are vector ones.
Funnily enough, D201 is small again, although it is almost identical to D200.

Thanks if anybody can help.


Attachment: Screenshot_1.jpg (file comment: Filesizes; 66.04 KiB)
PostPosted: Wed Jun 15, 2016 1:19 pm 
PDFsharp Guru
Joined: Mon Oct 16, 2006 8:16 am
Posts: 3096
Location: Cologne, Germany
Hi!

PDF files have a list of resources for each page. PDFsharp includes all listed resources.

Some PDF files take a simple approach: there is a single list of resources and all pages refer to this list. In this case, PDFsharp will include all resources (even those that are not needed) when splitting PDF files.

PDFsharp does not analyze the contents of PDF pages to remove unneeded resources; it relies on the information in the PDF file.
This could be the problem with your file - just speculation as you only include a screenshot, but no PDF.
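
To make the effect concrete, here is a toy model of the two layouts described above. It is not PDFsharp code: the class, the resource names and the byte sizes are all invented for illustration, and it only assumes that a split file carries every resource its page lists, whether or not the page draws it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: nothing here is a PDFsharp type, and all sizes are made up.
static class SharedResourceDemo
{
    // In this model a split file contains every resource the page *lists*,
    // regardless of whether the page content actually uses it.
    public static long SplitSize(IEnumerable<string> listedResources,
                                 IReadOnlyDictionary<string, long> resourceSizes)
    {
        return listedResources.Sum(name => resourceSizes[name]);
    }

    public static void Main()
    {
        var sizes = new Dictionary<string, long>
        {
            ["Im1"] = 2000000,
            ["Im2"] = 2100000,
            ["Im3"] = 2090000,
        };

        // Layout 1: a single resource list that every page refers to.
        string[] sharedList = { "Im1", "Im2", "Im3" };

        // Layout 2: the page lists only the one image it uses.
        string[] ownList = { "Im1" };

        Console.WriteLine(SplitSize(sharedList, sizes)); // prints 6190000
        Console.WriteLine(SplitSize(ownList, sizes));    // prints 2000000
    }
}
```

Under this model, a file that mixes both layouts would produce some small split files and some that are nearly as large as the original.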

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Wed Jun 15, 2016 1:36 pm 
Joined: Wed Jun 15, 2016 12:50 pm
Posts: 3
Hi Thomas,

Thank you for your quick reply. I was not aware of that; the code also seems to suggest that I only insert one page.
If the whole resource list is included, I don't understand how some split PDFs are small while others reach 6.2 MB when the original is 6.19 MB.
I am using the 1.32 build; is it worth trying the 1.50 build?

How can I purge this resource list?

Thank you

Regards,

PS: The forum does not allow large files to be uploaded, but the file can be downloaded here: https://we.tl/ZO0nWTEJW9


PostPosted: Wed Jun 15, 2016 2:04 pm 
PDFsharp Guru
Joined: Mon Oct 16, 2006 8:16 am
Posts: 3096
Location: Cologne, Germany
Vector? The PDF file was created from a printer driver and it looks like raster images (JPEG) to me.

PDFsharp 1.50 has more options to create smaller files. This may help. PDFsharp does not support all compression options of PDF files.
See also:
viewtopic.php?p=9647#p9647
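
For reference, a minimal sketch of setting the size-related options on `PdfDocument.Options` in PDFsharp 1.50 before saving. The helper name `SaveCompact` is hypothetical, and the property names are given as I recall them from the 1.50 API, so please check them against your version. As far as I know, these options mainly affect streams and images that PDFsharp writes itself, so they may not shrink page content that is imported unchanged.

```csharp
using PdfSharp.Pdf;

static class CompactSave
{
    // Hypothetical helper (not part of PDFsharp): apply size-related options, then save.
    public static void SaveCompact(PdfDocument document, string path)
    {
        // Make sure compression is not switched off globally.
        document.Options.NoCompression = false;

        // Compress page content streams (assumed to default to off in DEBUG builds).
        document.Options.CompressContentStreams = true;

        // Prefer smaller output over faster writing.
        document.Options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;

        // Let PDFsharp decide per JPEG image whether additional Flate encoding helps.
        document.Options.UseFlateDecoderForJpegImages = PdfUseFlateDecoderForJpegImages.Automatic;

        document.Save(path);
    }
}
```

In the loop from the first post, this would stand in for the plain `outputDocument.Save(...)` call.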

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Wed Jun 15, 2016 9:56 pm 
Joined: Wed Jun 15, 2016 12:50 pm
Posts: 3
YES, solved!

I upgraded to 1.50 (but didn't set any options), and the link you sent me suggested running the application as a release build rather than a debug build.
One of these two did the job; the files are now nice and small.

I still don't understand why, but then again I must admit I haven't studied the PDFsharp source code.

Thank you very much


PostPosted: Thu Jun 16, 2016 8:34 am 
PDFsharp Guru
Joined: Mon Oct 16, 2006 8:16 am
Posts: 3096
Location: Cologne, Germany
Davidveld wrote:
Still I don't understand the why, but then again i must admit i haven't studied the pdf sharp source code.
The DEBUG build creates "human readable" PDF files.
NuGet packages always contain the RELEASE build.
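
A rough, PDFsharp-independent illustration of why "human readable" output is so much larger, assuming that "human readable" here means the content streams are written uncompressed. The sketch builds a made-up content stream of repetitive drawing operators (as vector drawings tend to have) and compares its plain size with its deflated size. `ContentStreamSizeDemo` and the operator text are invented for the example; `DeflateStream` produces raw deflate data rather than the zlib-wrapped form used in PDF, which is close enough for a size comparison.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class ContentStreamSizeDemo
{
    // Build a fake, repetitive page content stream: many "move to / line to / stroke" operators.
    public static byte[] BuildSampleContent(int lineCount)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < lineCount; i++)
        {
            sb.Append(i).Append(" 0 m ").Append(i).Append(" 100 l S\n");
        }
        return Encoding.ASCII.GetBytes(sb.ToString());
    }

    // Deflate-compress a byte array and return the compressed bytes.
    public static byte[] Deflate(byte[] raw)
    {
        using (var buffer = new MemoryStream())
        {
            using (var deflate = new DeflateStream(buffer, CompressionMode.Compress, leaveOpen: true))
            {
                deflate.Write(raw, 0, raw.Length);
            }
            return buffer.ToArray();
        }
    }

    public static void Main()
    {
        byte[] plain = BuildSampleContent(10000);
        byte[] packed = Deflate(plain);
        Console.WriteLine("plain: {0} bytes, deflated: {1} bytes", plain.Length, packed.Length);
    }
}
```

Image data that is already JPEG-compressed gains little from this, which would fit the observation that the raster pages were small all along.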

_________________
Regards
Thomas Hoevel
PDFsharp Team

