PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


 Post subject: Converting to PDF v1.4
PostPosted: Mon May 07, 2012 3:09 pm 
Joined: Fri May 04, 2012 6:59 pm
Posts: 4
Hi

Since I couldn't use other PDF versions because of the iRef error, I decided to convert all files to PDF 1.4 using the iTextSharp workaround, but now I have other problems:

1) With versions older than 1.4, I sometimes get an error when using the PageCount property (PdfDocument.cs). I debugged and found that pageTreeRoot can be null, so I check for that and return 1 when it is null (yes, that could be wrong, but I needed to continue with the whole process of splitting thousands of PDF files).

2) Before adding that check I tried to use Pages.Count, but the GetKids method (PdfPages.cs) stopped with an error when xref3 was null. Again, I used an "if" statement to work around this.

3) How can I detect bad files? Some of the files are corrupted, so I would like to know whether there is a way to detect that without relying on try-catch.
The exceptions are:
a) "Rebuild failed: '>' Not expected at file pointer 3; Original message: PDF startxref not found".
b) "PDF header signature not found"
c) "Error reading string at file pointer 64"

As you can see, I need some guidance to continue my project in the right direction. In return, maybe I can help you find bugs and improve this great tool (I'm interested in exporting PNG images from PDF files).

Note: My users are using PDF versions 1.2, 1.3, 1.4 and 1.6 (and maybe more; those are just the ones I've detected).

Thanks in advance.


PostPosted: Mon May 07, 2012 9:09 pm 
PDFsharp Expert
Joined: Wed Dec 09, 2009 8:59 am
Posts: 339
Hi!

PDFsharp was written to read all files that strictly follow the PDF v1.4 specification.

Many 3rd party applications do not strictly follow those specs - and Adobe Reader tolerates a lot of deviations.
We update PDFsharp when we encounter files that work with Adobe Acrobat, but not with PDFsharp.
Your points 1 and 2 could indicate such issues.

Re 3: Adobe Reader tries to make the best of corrupted files. PDFsharp was not designed to correct errors and we won't try to do it.
PDF contains many nested structures and corruption can occur at every level.
Simple test: open the file with Adobe Reader and Adobe Acrobat. If either wants to save the file on close, then it fixed a corruption. If you find a way to fix/ignore the corruption, feel free to submit your modifications.
If the file works perfectly with Adobe Reader and Adobe Acrobat and neither of those wants to save on close, then it's probably a deviation from standards that PDFsharp does not yet tolerate. Send us your modifications if you fix it or submit the PDF files for investigation.
What to do with really corrupt files? Corruption can occur anywhere. PDFsharp could detect corruption and throw "PdfCorruptionExceptions", but that would require lots of "if" statements - and you would ignore these exceptions with a try/catch and still reject the file.
Since corruption can occur anywhere in the nested PDF structures, expect to see many different exceptions.
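
A cheap pre-screen can still sort out the most obvious cases before a file ever reaches the library. The sketch below is written in Python only to stay independent of any PDF library; the function name prescreen_pdf and the 1024-byte search window are assumptions for illustration, not part of PDFsharp. It looks only for the header signature and the trailer keywords behind exceptions a) and b) above. It cannot detect corruption deeper in the nested structures, so a try/catch around the real open call is still needed.

```python
import re

# Assumption: accept the header anywhere in the first `window` bytes and the
# trailer keywords anywhere in the last `window` bytes, mirroring the leniency
# of common readers. A strict check would insist on byte offset 0.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def prescreen_pdf(data: bytes, window: int = 1024) -> list:
    """Return a list of obvious structural problems; empty means 'looks plausible'."""
    problems = []
    if HEADER_RE.search(data[:window]) is None:
        problems.append("header signature not found")
    tail = data[-window:]  # whole buffer if it is shorter than the window
    if b"startxref" not in tail:
        problems.append("startxref not found")
    if b"%%EOF" not in tail:
        problems.append("%%EOF marker not found")
    return problems


if __name__ == "__main__":
    sample = b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\nstartxref\n9\n%%EOF\n"
    print(prescreen_pdf(sample))
    print(prescreen_pdf(b"<html>not a pdf</html>"))
```

An empty list only means the file passed these byte-level checks; files that pass can still fail later with errors such as c).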

The main purpose of PDFsharp was the creation of new PDF files with the option to import pages from existing PDF files (that conform to the standards). It's a tough job to make it compatible with non-corrupt files from various 3rd party tools that omit elements that were required by the v1.4 specification.

_________________
Öhmesh Volta ("() => true")
PDFsharp Team Holiday Substitute


PostPosted: Tue May 08, 2012 6:55 pm 

Joined: Fri May 04, 2012 6:59 pm
Posts: 4
Thanks, I appreciate your comments. ;)


PostPosted: Sat May 19, 2012 7:32 am 
PDFsharp Expert
Joined: Wed Dec 09, 2009 8:59 am
Posts: 339
Hi!
eks wrote:
Note: My users are using PDF versions 1.2, 1.3, 1.4 and 1.6 (and maybe more; those are just the ones I've detected).
Can you provide sample PDF files that lead to the errors you listed as "1)" and "2)"? Thanks in advance.
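
When picking samples from a large batch, it may help to group the files by the version their header declares. The following is a rough sketch in Python (chosen only to stay library-independent); header_version and group_by_version are hypothetical helper names, and the 1024-byte search window is an assumption. Note that the header is only a hint: from PDF 1.4 on, a /Version entry in the document catalog can override it, and this sketch does not look at the catalog.

```python
import re
from collections import defaultdict


def header_version(data: bytes):
    """Return the version declared in the header, e.g. '1.4', or None if absent."""
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def group_by_version(files: dict) -> dict:
    """Map each declared version (or 'unknown') to the list of file names.

    `files` maps a file name to the raw bytes of that file.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[header_version(data) or "unknown"].append(name)
    return dict(groups)


if __name__ == "__main__":
    batch = {
        "a.pdf": b"%PDF-1.2\n...",
        "b.pdf": b"%PDF-1.6\n...",
        "c.pdf": b"no header here",
    }
    print(group_by_version(batch))
```

One or two files from each group that trigger errors 1) or 2) would then make a compact sample set.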

_________________
Öhmesh Volta ("() => true")
PDFsharp Team Holiday Substitute

