PDFsharp & MigraDoc Foundation • View topic - "file may be corrupt" while it is not


All times are UTC


"file may be corrupt" while it is not

Moderator: Stefan Lange


JeIC2

Post subject: "file may be corrupt" while it is not

Posted: Tue Jan 23, 2018 9:29 am

Joined: Mon Jan 22, 2018 9:57 am
Posts: 1

Hello!

I use PDFsharp 1.50.4740-beta5 to merge 2 PDFs into 1 file. I used 1.32.3057 on my PDFs before, but then I kept getting the famous "Cannot handle iref streams." error on other PDFs, so I solved that by switching to 1.50. However, now on the PDF I attached, and on some others, I get this error:

"Unexpected character '0xffff' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file."

However, it is a legit PDF, as I can open it and read it. It is also interesting to note that it does work with 1.30. But with that version I can't merge it with the files that give the iref streams error mentioned above. Is there any way to solve this error? Is this a bug? Help would be appreciated.

Attachments:

PDFthatwontmerge.zip [158.18 KiB]

Thomas Hoevel

Post subject: Re: "file may be corrupt" while it is not

Posted: Tue Jan 23, 2018 10:21 am

PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3096
Location: Cologne, Germany

Hi!

JeIC2 wrote:

However, it is a legit PDF, as I can open it and read it.

The file is corrupt.
At position 164767 a stream begins. The length of the stream is given as 9979 bytes.
The size of the file is just 171690 bytes, so there are at most 6921 bytes of content for that stream, not the 9979 given in the header.
I call that "corrupt".
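
The arithmetic behind that check can be sketched in a few lines of Python. This is only an illustration, not PDFsharp code: the function names `bytes_available` and `length_is_plausible` are made up for this example, and the stream data is assumed to start at offset 164769 (the position QPDF reports in the output quoted further down), which yields the 6921 bytes mentioned above.

```python
def bytes_available(file_size: int, data_offset: int) -> int:
    """Upper bound on the stream payload: everything from data_offset to end of file."""
    return max(file_size - data_offset, 0)


def length_is_plausible(file_size: int, data_offset: int, declared_length: int) -> bool:
    """A declared /Length can never exceed the bytes left in the file."""
    return 0 <= declared_length <= bytes_available(file_size, data_offset)


# Numbers from this thread; the data offset 164769 is an assumption (see above).
available = bytes_available(171690, 164769)
plausible = length_is_plausible(171690, 164769, 9979)
print(available, plausible)
```

A check like this only shows that the declared length cannot be right. It does not tell you what the correct length is.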

Yes, Adobe Reader can open the file. And when I use "Save as" in Adobe Reader, I get a file that can be opened with PDFsharp.
Once again Adobe Reader does a better job when it comes to dealing with corrupt files. Adobe Reader sets the length of that stream to 390.

There are some pull requests on GitHub that are meant to improve how PDFsharp deals with corrupted files.
We have not evaluated those changes yet, so they are not included in beta5.
Feel free to try them and please let us know if any of those fixes helps with your file.
https://github.com/empira/PDFsharp/pulls

QPDF also identifies the file as corrupt:

Quote:

checking 119406-VKF_926516_20171003_081415.pdf
PDF Version: 1.3
File is not encrypted
File is not linearized
WARNING: 119406-VKF_926516_20171003_081415.pdf (object 11 0, file position 174748): EOF while reading token
WARNING: 119406-VKF_926516_20171003_081415.pdf (object 11 0, file position 164769): attempting to recover stream length
WARNING: 119406-VKF_926516_20171003_081415.pdf (object 11 0, file position 164769): recovered stream length: 1859

It comes up with a different stream length than Adobe Reader.
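
One common way to recover a stream length is to ignore the declared /Length and search for the `endstream` keyword instead. The following Python sketch shows that idea under simplifying assumptions. It is not the algorithm used by QPDF or Adobe Reader, and `recover_stream_length` is a hypothetical helper name.

```python
from typing import Optional


def recover_stream_length(data: bytes, data_offset: int) -> Optional[int]:
    """Guess the stream length by locating the first 'endstream' after data_offset.

    Returns None if no 'endstream' keyword follows the offset.
    """
    end = data.find(b"endstream", data_offset)
    if end == -1:
        return None
    # An end-of-line marker right before 'endstream' is not part of the payload.
    if data[data_offset:end].endswith(b"\r\n"):
        end -= 2
    elif end > data_offset and data[end - 1:end] in (b"\n", b"\r"):
        end -= 1
    return end - data_offset


# Tiny hand-made object with a wrong /Length, for illustration only.
sample = b"5 0 obj\n<< /Length 9979 >>\nstream\nHello PDF\nendstream\nendobj\n"
start = sample.index(b"stream\n") + len(b"stream\n")
recovered = recover_stream_length(sample, start)
print(recovered)
```

Because binary stream data may itself contain the bytes `endstream`, real tools have to apply further heuristics, which is presumably one reason why two programs can arrive at different lengths for the same file.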

_________________
Regards
Thomas Hoevel
PDFsharp Team


duvidas85

Post subject: Re: "file may be corrupt" while it is not

Posted: Fri Feb 02, 2018 1:54 pm

Joined: Fri Feb 02, 2018 1:51 pm
Posts: 1

The changes on this pull request fixed the problem for me:
https://github.com/empira/PDFsharp/pull/39

I've been using this code since 3rd of February and it's behaving very well. I have merged more than 500 different PDFs and saw an error in only 1 file (Unexpected token '\xE3' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp)
