PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


 Post subject: PDF Size
Posted: Tue Jan 03, 2023 3:17 pm

Joined: Thu Dec 22, 2022 4:24 pm
Posts: 5
I did not find anything in the wiki or forum (at least with my search terms).

I am adding custom metadata ("Item1: current time") to existing PDFs via PDFsharp, and I noticed the file sizes doubling. I also noticed the inline tags (the '/xxxx' names) being split out onto separate lines. I was wondering if someone could shed some light on what is going on with those file sizes (e.g. 866k > 1731k). This seems pretty consistent: I have tried 5 different files from different sources.

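I am implementing:
// memStream already holds the bytes of the existing PDF.
PdfDocument NewPDF = PdfReader.Open(memStream);

// Keys in the Info dictionary must start with a slash.
NewPDF.Info.Elements.Add(new KeyValuePair<string, PdfItem>("/Item1", new PdfString(DateTime.Now.ToString())));
// Save back into the same stream and keep it open.
NewPDF.Save((Stream)memStream, false);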


 Post subject: Re: PDF Size
Posted: Wed Jan 04, 2023 9:55 am
PDFsharp Guru

Joined: Sat Mar 14, 2015 10:15 am
Posts: 1007
Location: CCAA
Leifr wrote:
I was wondering if someone could shed some light on what is going on with those file sizes (e.g. 866k > 1731k).
Without the actual files, I can only speculate.
PDFsharp does not yet support compressing everything that can be compressed.
So some items might be compressed before the modification, but uncompressed after the modification.

The DEBUG build of PDFsharp produces "readable" and commented PDF files that are much larger than the PDF files created with the RELEASE build. Both builds allow you to tweak the compression settings.

Are you using a DEBUG build?

_________________
Best regards
Thomas
(Freelance Software Developer with several years of MigraDoc/PDFsharp experience)
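
The compression settings mentioned in this reply can be sketched roughly as follows. This is a minimal sketch, not a confirmed fix for the size increase: the class and method names are hypothetical, and the option names (NoCompression, CompressContentStreams, FlateEncodeMode) are the ones exposed by PdfDocument.Options in PDFsharp 1.5x as far as I know, so they should be checked against the version in use. Whether they shrink an existing file depends on which objects PDFsharp actually rewrites on save.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static class CompressionSettingsSketch
{
    // Hypothetical helper: opens an existing PDF and saves it again
    // with the compression-related options switched on.
    public static void SaveCompressed(string inputPath, string outputPath)
    {
        PdfDocument document = PdfReader.Open(inputPath, PdfDocumentOpenMode.Modify);

        // Make sure compression is not disabled globally.
        document.Options.NoCompression = false;
        // Ask PDFsharp to compress content streams.
        document.Options.CompressContentStreams = true;
        // Prefer smaller output over faster saving.
        document.Options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;

        document.Save(outputPath);
    }
}
```

Comparing the output of a RELEASE build with and without these settings would show how much of the difference is due to compression alone.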


 Post subject: Re: PDF Size
Posted: Mon Jan 09, 2023 1:16 pm

Joined: Thu Dec 22, 2022 4:24 pm
Posts: 5
TH-Soft wrote:
Are you using a DEBUG build?


Had to refocus on a different feature last week (such is work). I tried with a local console app (debug build), using the stamp-only file, and the increase is negligible. The AWS server, on the other hand, produces those giant files once I add the metadata. Have you (or anyone) heard of AWS having this issue?

Here are a couple of files, before and after:
Stamp only
https://drive.google.com/file/d/1E4d1jX ... share_link

Stamp and Meta Data
https://drive.google.com/file/d/1TQbgYf ... sp=sharing

*** Edit - 2023-01-10
Comparing the two files, it seems that the original PDF is copied and the Header and Object sections are then added a second time, with the metadata (rather than the original section being updated). I am not really sure how to handle this. I manually removed the old section, and the file loads fine.
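
One possible explanation for the duplicated content, which the thread does not confirm, is that saving into the same stream the document was read from writes the new PDF after the bytes that are already there. Under that assumption, a minimal sketch is to save into a fresh MemoryStream instead. The class and method names are hypothetical, and the input stream is assumed to be seekable.

```csharp
using System;
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static class MetadataStampSketch
{
    // Hypothetical helper: returns a new stream containing the PDF
    // with the custom "/Item1" entry added to the Info dictionary.
    public static MemoryStream AddItem1(Stream source)
    {
        // Read from the start of the input (requires a seekable stream).
        source.Position = 0;
        PdfDocument document = PdfReader.Open(source, PdfDocumentOpenMode.Modify);

        // Custom Info entries need a key that starts with a slash.
        document.Info.Elements.SetString("/Item1", DateTime.Now.ToString("o"));

        // Write into an empty stream so no bytes of the input are carried over.
        var output = new MemoryStream();
        document.Save(output, false); // false: keep the stream open
        output.Position = 0;
        return output;
    }
}
```

If the output is still about twice the size of the input with this approach, the cause lies elsewhere, for example in the compression behavior described earlier in the thread.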


 Post subject: Re: PDF Size
Posted: Fri Jan 27, 2023 2:49 pm

Joined: Thu Dec 22, 2022 4:24 pm
Posts: 5
Final update on this, in case anyone else runs into it: I used QPDF to "linearize" the PDFs first, and this solves most issues. It fixes bad reference tables (these can cause PDFsharp to error out) and redefines streams.
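
The QPDF step described above could be run from .NET along these lines. This is only a sketch under stated assumptions: the qpdf executable is assumed to be on the PATH, the class and method names are hypothetical, and ProcessStartInfo.ArgumentList requires .NET Core 2.1 or later. The --linearize option is a standard qpdf flag; the exact command line used by the poster is not given in the thread.

```csharp
using System;
using System.Diagnostics;

public static class QpdfSketch
{
    // Hypothetical helper: runs "qpdf --linearize input output"
    // before the file is handed to PDFsharp.
    public static void Linearize(string inputPath, string outputPath)
    {
        var startInfo = new ProcessStartInfo("qpdf")
        {
            UseShellExecute = false,
            RedirectStandardError = true,
        };
        startInfo.ArgumentList.Add("--linearize");
        startInfo.ArgumentList.Add(inputPath);
        startInfo.ArgumentList.Add(outputPath);

        using (Process process = Process.Start(startInfo))
        {
            if (process == null)
            {
                throw new InvalidOperationException("Could not start qpdf.");
            }

            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();

            // As far as I know, qpdf returns 0 on success and 3 when it
            // produced output but reported warnings; verify for your version.
            if (process.ExitCode != 0 && process.ExitCode != 3)
            {
                throw new InvalidOperationException(
                    "qpdf failed with exit code " + process.ExitCode + ": " + errors);
            }
        }
    }
}
```

The linearized output file, not the original, would then be opened with PdfReader.Open.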

