PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

ScanHexadecimalString gets stuck in an infinite loop
https://forum.pdfsharp.net/viewtopic.php?f=3&t=399

Author:  Flecto [ Tue Jun 03, 2008 3:21 pm ]
Post subject:  ScanHexadecimalString gets stuck in an infinite loop

The opening angle brackets of the following subdictionary

Code:
<</Predictor 15
/Columns 8
/Colors 3>>


inside a content stream are falsely interpreted as the beginning of a hexadecimal string. The result is that ScanHexadecimalString() of CLexer gets stuck in an infinite loop.

A possible solution to this problem might be to check at the beginning of ScanHexadecimalString() whether the NextChar is also '<'. If that's the case, ScanHexadecimalString() shouldn't continue.
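
To make the idea concrete, here is a rough sketch of that lookahead. It is plain Python and not PDFsharp code: the function name scan_angle_bracket, the token names and the error handling are all made up for illustration, and CLexer may well be structured differently. The two points it shows are the check for a second '<' before hex scanning starts, and a loop that always advances so it cannot spin on a single character.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")
WHITESPACE = set(" \t\r\n\f\0")


def scan_angle_bracket(data, pos):
    """Scan the token that starts at data[pos] == '<'.

    Hypothetical helper, not part of PDFsharp.
    Returns a tuple (kind, value, next_pos).
    """
    if pos >= len(data) or data[pos] != "<":
        raise ValueError("expected '<' at position %d" % pos)

    # Lookahead: '<<' opens a dictionary, so do not start hex scanning.
    if data.startswith("<<", pos):
        return ("dict_begin", "<<", pos + 2)

    digits = []
    i = pos + 1
    while i < len(data):
        ch = data[i]
        if ch == ">":
            return ("hex_string", "".join(digits), i + 1)
        if ch in HEX_DIGITS:
            digits.append(ch)
        elif ch not in WHITESPACE:
            # Fail loudly instead of waiting forever on an unexpected character.
            raise ValueError("unexpected %r in hex string at %d" % (ch, i))
        i += 1  # always advance, so the loop terminates

    raise ValueError("unterminated hex string starting at %d" % pos)
```

With this, scan_angle_bracket("<</Predictor 15", 0) reports the start of a dictionary and leaves the position just after the second '<', while a real hex string such as "<48 65>" is still read up to its closing '>'.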

Author:  Thomas Hoevel [ Wed Jun 04, 2008 7:49 am ]
Post subject: 

That bug will be fixed with the next release (which is scheduled for summer 2008).

Author:  Soldier-B [ Tue Nov 25, 2008 8:49 pm ]
Post subject: 

Is there a fix for this issue? I took it upon myself to basically reinvent the wheel as far as parsing content streams goes, but I'd rather not rely on regular expressions to do all my heavy lifting. Thanks.

Author:  Thomas Hoevel [ Wed Nov 26, 2008 12:19 pm ]
Post subject: 

If this issue still occurs with PDFsharp 1.20 (published June 24, 2008), then please provide us with a PDF file that allows us to replicate the issue.

Author:  Soldier-B [ Wed Nov 26, 2008 1:45 pm ]
Post subject: 

Gladly. 140.pdf

Author:  Thomas Hoevel [ Thu Nov 27, 2008 8:18 am ]
Post subject: 

Hi!
Soldier-B wrote:
Gladly. 140.pdf

I tested that file with PdfSharp.Explorer and with the TwoPagesOnOne sample - no endless loop.
I cannot replicate this problem with PDFsharp 1.20 published June 24, 2008.

When does this endless loop occur?

Author:  Soldier-B [ Mon Dec 01, 2008 5:44 pm ]
Post subject: 

When using the ContentReader.ReadContent function, 100% of the time I end up stuck in the ScanHexadecimalString function: the current character is always "<" and the position never advances.

Here's some sample VB code to show you how I get stuck in the loop.

Code:
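Imports System
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO
Imports PdfSharp.Pdf.Content
Imports PdfSharp.Pdf.Content.Objects

Module Program
    Sub Main()
        Dim doc As PdfDocument = PdfReader.Open("140.pdf")

        For Each page As PdfPage In doc.Pages
            ' ReadContent never returns for this file.
            Dim cseq As CSequence = ContentReader.ReadContent(page)
            Console.WriteLine("This message will not show.")
        Next

        doc.Close()
    End Sub
End Module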


I realize that this code doesn't really accomplish anything, but if you run it you will never get it to output to the console because it's stuck in ScanHexadecimalString.

I hope that helps out some and thanks again.

- B

Author:  Soldier-B [ Wed Dec 17, 2008 3:50 pm ]
Post subject: 

Hi Thomas, I was curious to know if you were able to reproduce the infinite loop I get based on the code I posted?

Author:  Thomas Hoevel [ Wed Jan 07, 2009 9:40 am ]
Post subject: 

Yes, I was able to replicate the problem ...
... but because parsing PDF is not my area of expertise I had to forward it to a team member.

I'm back from holiday today. I'll check if the problem was solved.

Author:  Soldier-B [ Mon Jan 12, 2009 2:06 pm ]
Post subject: 

Thanks Thomas.

Author:  Soldier-B [ Thu Apr 09, 2009 1:42 pm ]
Post subject: 

I'm just curious if any progress has been made towards fixing this bug.

Author:  azrafe7 [ Mon Jan 11, 2016 8:24 pm ]
Post subject:  Re: ScanHexadecimalString gets stuck in an infinite loop

Sorry to bring this topic back to life, but I just stumbled on what seems to be the same problem.

I get an infinite loop in ScanHexadecimalString() using the latest version of PDFSharp (v1.32.2608.0) in combination with ContentReader.ReadContent().

I can reproduce the issue with this file for example.

Code:
using PdfSharp.Pdf.Content;          // ContentReader
using PdfSharp.Pdf.Content.Objects;  // CObject

var document = PdfSharp.Pdf.IO.PdfReader.Open(inFileName);
var page = document.Pages[0];
CObject content = ContentReader.ReadContent(page); // <-- endless loop here


Is there a known fix for this, or maybe I'm doing something wrong?!

Any help would be greatly appreciated. Thanks.

PS: With the posted PDF, the lexer seems to get stuck while parsing '<' followed by '/' in an Artifact token (not sure this info is of any help, but it can't hurt, right?!)

Author:  TH-Soft [ Tue Jan 12, 2016 8:11 am ]
Post subject:  Re: ScanHexadecimalString gets stuck in an infinite loop

azrafe7 wrote:
I get an infinite loop in ScanHexadecimalString() using the latest version of PDFSharp (v1.32.2608.0) in combination with ContentReader.ReadContent().
The latest version is PDFsharp 1.50 beta 3.
If the latest version 1.50 cannot read the file then the issue will be investigated.

Author:  azrafe7 [ Tue Jan 12, 2016 12:29 pm ]
Post subject:  Re: ScanHexadecimalString gets stuck in an infinite loop

Sorry, my bad (I'd previously installed it via nuget).

I've retested with 1.50 from github and it works, thanks!

Author:  Thomas Hoevel [ Wed Jan 13, 2016 8:49 am ]
Post subject:  Re: ScanHexadecimalString gets stuck in an infinite loop

azrafe7 wrote:
Sorry, my bad (I'd previously installed it via nuget).
You can also get 1.50 from NuGet (include pre-releases in the filter).

All times are UTC