PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

Bug + patch: Infinite loop in hexadecimal string reader
https://forum.pdfsharp.net/viewtopic.php?f=3&t=3413

Author:  Gerben Vos [ Thu Aug 04, 2016 3:15 pm ]
Post subject:  Bug + patch: Infinite loop in hexadecimal string reader

We have two files that cause PdfSharp to hang while reading them. Both are, unfortunately, confidential client files.

One is actually an uncompressed ZIP file (using the store method) containing two PDFs. Because there is some flexibility in searching for the PDF header and trailer, the ZIP header and trailer are skipped, so the header of the first PDF and the trailer of the second PDF are recognized, and hilarity ensues.

The other is a PDF file with an XRef stream that has a strange xref entry; not sure yet if it is actually corrupt or not, or if there is an additional bug in PdfSharp.

Anyway, in both cases PdfSharp ends up trying to read an object at an incorrect offset in the file, thinks it is a hexadecimal string (which it isn't), and ends up in an infinite loop.

If more detail is needed, I can try to construct a non-confidential PDF exhibiting the problem.

Patch: see attachment.

I also fixed (according to the spec, but untested) the case where a hexadecimal string ends with an incomplete byte, that is, an odd number of hex digits; the spec says the missing final digit is assumed to be 0.
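
For readers without the attachment, here is a minimal sketch of the kind of loop involved. It is written in Python purely for illustration; it is not the patch and not PDFsharp code, and the function name `read_hex_string` is made up. The idea is that the loop must stop on end of input or on a character that cannot occur in a hexadecimal string, instead of waiting forever for a `>`.

```python
from typing import Tuple

HEX_DIGITS = b"0123456789abcdefABCDEF"
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def read_hex_string(data: bytes, pos: int) -> Tuple[bytes, int]:
    """Parse a PDF hexadecimal string whose '<' is at data[pos].

    Returns (decoded bytes, position just past the closing '>').
    Raises ValueError on end of input or on an invalid character,
    so the loop terminates on every input.
    A real lexer would first rule out '<<', which opens a dictionary.
    """
    if data[pos:pos + 1] != b"<":
        raise ValueError("not a hexadecimal string")
    pos += 1
    digits = bytearray()
    while True:
        if pos >= len(data):
            # End of input: without this check the loop never ends.
            raise ValueError("unterminated hexadecimal string")
        ch = data[pos]
        pos += 1
        if ch == ord(">"):
            break
        if ch in PDF_WHITESPACE:
            continue  # whitespace inside the string is ignored
        if ch not in HEX_DIGITS:
            raise ValueError("invalid character in hexadecimal string")
        digits.append(ch)
    if len(digits) % 2:
        # Odd number of digits: the missing final digit is taken as 0.
        digits.append(ord("0"))
    return bytes.fromhex(digits.decode("ascii")), pos
```

With this sketch, `<901FA>` decodes to the bytes 90 1F A0, and an input such as `<4865` with no closing `>` raises an error instead of hanging.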

Attachments:
pdfsharp-674.zip [522 Bytes]

Author:  Gerben Vos [ Thu Aug 04, 2016 3:21 pm ]
Post subject:  Re: Bug + patch: Infinite loop in hexadecimal string reader

Note that the entire appendix H, "Compatibility and Implementation Notes", which contains the bit about scanning the first and last 1024 bytes of the file for the header and trailer, has disappeared from the ISO standard that functions as the PDF 1.7 spec.
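
To make the scanning rule concrete, here is a small Python sketch of it. This is an illustration only, not PDFsharp's implementation; the function names and the `window` parameter are made up. It looks for `%PDF-` in the first 1024 bytes and for `%%EOF` in the last 1024 bytes.

```python
def find_header_offset(data: bytes, window: int = 1024) -> int:
    """Offset of '%PDF-' within the first `window` bytes, or -1 if absent."""
    return data.find(b"%PDF-", 0, window)


def find_eof_marker(data: bytes, window: int = 1024) -> int:
    """Offset of the last '%%EOF' within the final `window` bytes, or -1."""
    start = max(0, len(data) - window)
    return data.rfind(b"%%EOF", start)
```

This also shows why the stored ZIP from my first post gets accepted: the ZIP local file header is only a few dozen bytes, so the `%PDF-` of the first PDF still falls inside the window.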

Also, I wonder: if the header is found at a non-zero offset, would all object offsets in the xref table then be relative to the start of that header rather than to the beginning of the file? Has anyone already tested how Adobe handles this?
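
This does not answer what Adobe does, but for a given file the question can at least be checked empirically. Below is a hedged Python sketch (the function name `offset_bases` and its parameters are made up for illustration): it takes one offset from the xref table and reports under which interpretation it actually lands on the expected `N G obj` keyword.

```python
import re
from typing import List


def offset_bases(data: bytes, header_offset: int, xref_offset: int,
                 obj_num: int, gen: int = 0) -> List[str]:
    """Return which base(s) make xref_offset point at 'obj_num gen obj'.

    'file start' means the offset is counted from byte 0 of the file;
    'header' means it is counted from the position of '%PDF-'.
    """
    pattern = re.compile(rb"%d\s+%d\s+obj" % (obj_num, gen))
    hits = []
    if pattern.match(data, xref_offset):
        hits.append("file start")
    if header_offset and pattern.match(data, header_offset + xref_offset):
        hits.append("header")
    return hits
```

An empty result means the entry is wrong under both interpretations, which is the situation that sends a reader off to parse garbage.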

Author:  Thomas Hoevel [ Tue Aug 16, 2016 12:49 pm ]
Post subject:  Re: Bug + patch: Infinite loop in hexadecimal string reader

Thanks for the patch.


We have seen files that had about 20000 zero bytes after the trailer and the %%EOF mark, and Adobe Reader opens them without complaining.

Too many files do not have the trailer within the last 1024 bytes.
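
One lenient strategy that copes with such files can be sketched as follows. This is a Python illustration under assumptions, not a description of what PDFsharp actually does; the function name and the `window` parameter are hypothetical. It skips trailing zero bytes and whitespace, looks in the last 1024 bytes of what remains, and falls back to searching the whole file.

```python
def find_eof_marker_lenient(data: bytes, window: int = 1024) -> int:
    """Offset of the last '%%EOF', tolerating padding after it; -1 if none."""
    end = len(data)
    # Ignore trailing NUL bytes and whitespace appended after the marker.
    while end > 0 and data[end - 1] in b"\x00\t\n\x0c\r ":
        end -= 1
    start = max(0, end - window)
    pos = data.rfind(b"%%EOF", start, end)
    if pos < 0:
        # Other junk after the marker: fall back to the whole file.
        pos = data.rfind(b"%%EOF", 0, end)
    return pos
```

The fallback costs a full backward scan only for files that fail the cheap check.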

I have no idea about your header offset question.

All times are UTC