PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

PDF Size
https://forum.pdfsharp.net/viewtopic.php?f=2&t=4408

Author:  Leifr [ Tue Jan 03, 2023 3:17 pm ]
Post subject:  PDF Size

I did not find anything in the wiki or forum (at least with my search terms).

I am adding custom metadata ("item1: current time") to existing PDFs via PDFsharp. I noticed the file sizes doubling. I also noticed the inline tags (the '/xxxx') being split out onto separate lines. I was wondering if someone could shed some light on what is going on with those file sizes (e.g. 866k > 1731k). This seems pretty consistent: I have tried 5 different files from different sources.

I am implementing:
// Open the existing PDF from the stream.
PdfDocument NewPDF = PdfReader.Open(memStream);

// Add a custom entry to the Info dictionary; the key is a PDF name with a leading slash.
NewPDF.Info.Elements.Add(new KeyValuePair<string, PdfItem>("/Item1", new PdfString(DateTime.Now.ToString())));
NewPDF.Save((Stream)memStream, false);

Author:  TH-Soft [ Wed Jan 04, 2023 9:55 am ]
Post subject:  Re: PDF Size

Leifr wrote:
I was wondering if someone could shed some light on what is going on with those file sizes (e.g. 866k > 1731k).

Without the actual files, I can only speculate.
PDFsharp does not yet support compressing everything that can be compressed.
So some items might be compressed before the modification, but uncompressed after the modification.

The DEBUG build of PDFsharp produces "readable" and commented PDF files that are much larger than PDF files created with the RELEASE build. Both builds allow you to tweak the compression settings.

Are you using a DEBUG build?
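
For readers who want to try the compression settings mentioned above, here is a minimal sketch. It assumes the PdfDocumentOptions properties as they appear in PDFsharp 1.5x (NoCompression, CompressContentStreams, FlateEncodeMode); the helper name PreferSmallOutput is made up for illustration, and other PDFsharp versions may expose different options.

```csharp
using PdfSharp.Pdf;

public static class CompressionSettings
{
    // Hypothetical helper: asks PDFsharp to favor small output files.
    // Call it on a document before calling Save.
    public static void PreferSmallOutput(PdfDocument document)
    {
        // Make sure compression is not switched off entirely.
        document.Options.NoCompression = false;

        // Compress page content streams when the document is written.
        document.Options.CompressContentStreams = true;

        // Trade some speed for a better deflate ratio.
        document.Options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;
    }
}
```

Comparing the output size with and without this call should show whether uncompressed streams account for the growth.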

Author:  Leifr [ Mon Jan 09, 2023 1:16 pm ]
Post subject:  Re: PDF Size

TH-Soft wrote:
Are you using a DEBUG build?


Had to refocus on a different feature last week (such is work). I tried with a local console app (debug build), using the stamp-only file, and the increase is negligible. The AWS server, on the other hand, is producing those giant files once I add the metadata. Have you (or anyone) heard of AWS having this issue?

Here are a couple of files, before and after:
Stamp only
https://drive.google.com/file/d/1E4d1jX ... share_link

Stamp and Meta Data
https://drive.google.com/file/d/1TQbgYf ... sp=sharing

*** Edit - 2023-01-10
Comparing the two files, it would seem that it is copying the original PDF and then adding the header and object sections a second time, with the metadata (rather than updating the original section). Not really sure how to handle this. I manually removed the old section, and the file loads fine.
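
One possible explanation, which the thread does not confirm, is that the modified document was written back into the same stream it was read from, so the new bytes landed after the original ones instead of replacing them. A minimal sketch of a way to rule that out is to save into a fresh MemoryStream. The class and method names below (MetadataStamper, AddCustomInfo) are hypothetical; the PDFsharp calls are the ones used in the original post plus PdfReader.Open with an explicit open mode.

```csharp
using System;
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static class MetadataStamper
{
    // Hypothetical helper: returns a new stream holding the modified PDF.
    // The input stream is only read, never written to.
    public static MemoryStream AddCustomInfo(Stream input, string key, string value)
    {
        // Make sure reading starts at the beginning of the source PDF.
        if (input.CanSeek)
            input.Position = 0;

        PdfDocument document = PdfReader.Open(input, PdfDocumentOpenMode.Modify);

        // Custom Info keys are PDF names, so the key needs a leading slash.
        document.Info.Elements[key] = new PdfString(value);

        // Save into a fresh stream so nothing can end up after the original bytes.
        var output = new MemoryStream();
        document.Save(output, false);
        output.Position = 0;
        return output;
    }
}
```

A call such as MetadataStamper.AddCustomInfo(memStream, "/Item1", DateTime.Now.ToString()) then yields a stream whose length can be compared directly against the original.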

Author:  Leifr [ Fri Jan 27, 2023 2:49 pm ]
Post subject:  Re: PDF Size

Final update on this, in case anyone else runs into it: I used QPDF to "linearize" the PDFs first, and this solves most issues. Linearizing fixes bad reference tables (these can cause PDFsharp to error out) and redefines streams.
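
The linearize step can be scripted from the same C# application. The sketch below assumes the qpdf executable is installed and on the PATH; the class and method names (QpdfRunner, BuildArguments, Linearize) are hypothetical. It relies only on qpdf's documented `--linearize` option and its `infile outfile` argument order.

```csharp
using System;
using System.Diagnostics;

public static class QpdfRunner
{
    // Builds the command line "--linearize <input> <output>" with quoted paths.
    public static string BuildArguments(string inputPath, string outputPath)
    {
        return "--linearize \"" + inputPath + "\" \"" + outputPath + "\"";
    }

    // Runs qpdf and returns its exit code (0 means no errors or warnings).
    public static int Linearize(string inputPath, string outputPath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "qpdf", // assumption: qpdf is reachable via PATH
            Arguments = BuildArguments(inputPath, outputPath),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

The linearized output file would then be the one opened with PdfReader.Open.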

All times are UTC