PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
PostPosted: Tue Jan 23, 2018 9:29 am 
Joined: Mon Jan 22, 2018 9:57 am
Posts: 1
Hello!

I use PDFsharp 1.50.4740-beta5 to merge two PDFs into one file. I previously used 1.32.3057, but I kept getting the well-known "Cannot handle iref streams." error on some PDFs, so I switched to 1.50. However, with the attached PDF and some others, I now get this error:

"Unexpected character '0xffff' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file."

However, it is a legit PDF, as I can open it and read it. Interestingly, it does work with 1.30. With that version, though, I cannot merge it with the files that trigger the iref streams error mentioned above. Is there any way to solve this error? Is this a bug? Help would be appreciated.


Attachments:
PDFthatwontmerge.zip [158.18 KiB]
Downloaded 407 times
PostPosted: Tue Jan 23, 2018 10:21 am 
PDFsharp Guru
Joined: Mon Oct 16, 2006 8:16 am
Posts: 3092
Location: Cologne, Germany
Hi!
JeIC2 wrote:
"However, it is a legit PDF, as I can open it and read it."

The file is corrupt.
At position 164767 a stream begins. The length of the stream is given as 9979 bytes.
The file is only 171690 bytes long, so at most 6921 bytes of content remain for that stream, not the 9979 given in the header.
I call that "corrupt".
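
The arithmetic behind that statement can be sketched in a few lines of Python. This is only an illustration, not PDFsharp code: the function name `check_stream_length` is hypothetical, and it assumes the stream data starts at offset 164769, the position QPDF reports further down.

```python
def check_stream_length(file_size: int, data_offset: int, declared_length: int) -> dict:
    """Compare a stream's declared length with the bytes actually left in the file."""
    # Bytes between the start of the stream data and the end of the file.
    available = max(file_size - data_offset, 0)
    return {
        "available": available,
        # How far the declared length reaches past the end of the file.
        "overrun": max(declared_length - available, 0),
        "fits": declared_length <= available,
    }


# Numbers quoted in this thread; data_offset is an assumption (see above).
report = check_stream_length(file_size=171690, data_offset=164769, declared_length=9979)
print(report)
```

With these inputs the declared length runs 3058 bytes past the end of the file, which matches the position 174748 where QPDF reports hitting EOF.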

Yes, Adobe Reader can open the file. And when I use "Save as" in Adobe Reader, I get a file that can be opened with PDFsharp.
Once again Adobe Reader does a better job when it comes to dealing with corrupt files. Adobe Reader sets the length of that stream to 390.

There are some pull requests on GitHub that are meant to improve how PDFsharp deals with corrupted files.
We have not evaluated those changes yet, so they are not included in beta5.
Feel free to try them and please let us know if any of those fixes helps with your file.
https://github.com/empira/PDFsharp/pulls

QPDF also identifies the file as corrupt:
Quote:
checking 119406-VKF_926516_20171003_081415.pdf
PDF Version: 1.3
File is not encrypted
File is not linearized
WARNING: 119406-VKF_926516_20171003_081415.pdf (object 11 0, file position 174748): EOF while reading token
WARNING: 119406-VKF_926516_20171003_081415.pdf (object 11 0, file position 164769): attempting to recover stream length
WARNING: 119406-VKF_926516_20171003_081415.pdf (object 11 0, file position 164769): recovered stream length: 1859

It comes up with a different stream length than Adobe Reader.
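
One common way to recover a stream length is to ignore the declared value and scan forward for the "endstream" keyword. The Python sketch below shows that idea only; it is not claimed to be what QPDF or Adobe Reader actually do, and `recover_stream_length` is a hypothetical helper. Binary stream data can itself contain the bytes "endstream", and tools differ in how they handle such cases, which is one possible reason for recovered lengths to disagree.

```python
from typing import Optional


def recover_stream_length(data: bytes, data_offset: int) -> Optional[int]:
    """Return the number of bytes from data_offset up to the first 'endstream'.

    A single end-of-line marker directly before the keyword is not counted.
    Returns None if the keyword does not occur after data_offset.
    """
    end = data.find(b"endstream", data_offset)
    if end == -1:
        return None
    body = data[data_offset:end]
    # Drop one EOL (CRLF, LF, or CR) that precedes the keyword.
    if body.endswith(b"\r\n"):
        end -= 2
    elif body.endswith((b"\n", b"\r")):
        end -= 1
    return end - data_offset


# Synthetic example: the declared /Length (9979) is wrong, the real payload is 5 bytes.
sample = b"<< /Length 9979 >>\nstream\nHELLO\nendstream\nendobj\n"
offset = sample.index(b"stream\n") + len(b"stream\n")
print(recover_stream_length(sample, offset))  # 5
```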

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Fri Feb 02, 2018 1:54 pm 
Joined: Fri Feb 02, 2018 1:51 pm
Posts: 1
The changes on this pull request fixed the problem for me:
https://github.com/empira/PDFsharp/pull/39

I have been using this code since the 3rd of February and it has been behaving very well. I have merged more than 500 different PDFs and saw an error in only one file (Unexpected token '\xE3' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp)

