PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly

All times are UTC


Posted: Wed Apr 27, 2011 12:35 pm
Joined: Sun Apr 24, 2011 3:08 pm
Posts: 3
Whenever I try to open a PDF that was generated using MS ReportViewer, I get an assertion saying "Not Implemented" in ScanNextToken (Lexer.cs, line 163; this is version 1.31).

The character code at that point is 18.

This happens both in my own code and in the excellent PdfMerge application written by Charles Van Lingen, so I am reasonably sure it is not something I am doing wrong.

The PDF in question displays just fine: every PDF viewer I have tried (Adobe, Foxit, etc.) opens it without difficulty.

I tried to attach the offending PDF to this post but the forum does not allow it.

Anyone else run into this problem?

Thanks very much... RKM


Posted: Wed Apr 27, 2011 7:24 pm
So I found one problem at least: in the PDF files generated by MS Report, the "stream" token has an extra blank at the end of it.
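
A quick way to confirm this in the raw file is sketched below as a standalone helper. The class and method names are made up for illustration and are not part of PDFsharp. It relies on the PDF specification's rule that the "stream" keyword is followed directly by LF or CR LF, and it is only a rough scan: the bytes "stream" can also occur inside binary stream data, so a hit is a place to look at in a hex viewer, not proof of a defect.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical diagnostic helper, not part of PDFsharp.
public static class StreamKeywordCheck
{
    // Returns the byte offset of every "stream" keyword that is NOT
    // immediately followed by LF or CR LF. The "stream" inside
    // "endstream" is skipped.
    public static List<int> FindSuspectStreamKeywords(byte[] pdf)
    {
        byte[] keyword = Encoding.ASCII.GetBytes("stream");
        var suspects = new List<int>();

        for (int i = 0; i + keyword.Length <= pdf.Length; i++)
        {
            if (!Matches(pdf, i, keyword))
                continue;

            // Ignore the tail of "endstream".
            if (i >= 3 && pdf[i - 3] == 'e' && pdf[i - 2] == 'n' && pdf[i - 1] == 'd')
                continue;

            int next = i + keyword.Length;
            bool lf = next < pdf.Length && pdf[next] == '\n';
            bool crlf = next + 1 < pdf.Length && pdf[next] == '\r' && pdf[next + 1] == '\n';

            if (!lf && !crlf)
                suspects.Add(i);
        }
        return suspects;
    }

    // True if 'pattern' occurs in 'data' starting at 'offset'.
    private static bool Matches(byte[] data, int offset, byte[] pattern)
    {
        for (int j = 0; j < pattern.Length; j++)
        {
            if (data[offset + j] != pattern[j])
                return false;
        }
        return true;
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: StreamKeywordCheck <file.pdf>");
            return;
        }

        byte[] pdf = File.ReadAllBytes(args[0]);
        foreach (int offset in FindSuspectStreamKeywords(pdf))
            Console.WriteLine("suspect \"stream\" keyword at byte offset " + offset);
    }
}
```

Run against the MS Report output, each reported offset should point at a "stream" keyword with something other than a line ending right after it, such as the extra blank described here.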

I modified ScanKeyword so that it handles this case as follows:
Code:
      // Check known tokens
      switch (this.token.ToString().Trim())
      {
        case "obj":
          return this.symbol = Symbol.Obj;
..........


And that made it fail in a different spot. Unfortunately I am still stuck.

Does anybody know how I can figure out what is wrong with this PDF (from PDFsharp's perspective)?

Any help gratefully received. Thanks... RKM


Last edited by DotNetSchnauz on Thu Apr 28, 2011 11:44 am, edited 1 time in total.

Posted: Wed Apr 27, 2011 7:59 pm
So... I found a way to make this work (for my case anyway) with a few changes to Lexer.cs.

I changed line 158 of ScanNextToken to use a more restrictive custom IsLetter(ch) function instead of Char.IsLetter(ch). Using the debugger, I observed Char.IsLetter() returning true for some really strange values (in the high 200s, for example), and I theorized that maybe these should not be treated as letters.

Code:
      if (IsLetter(ch))
        return this.symbol = ScanKeyword();


which is implemented as follows:

Code:
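    // Restrictive replacement for Char.IsLetter: accepts only ASCII letters,
    // plus the blank so that a trailing blank stays inside the keyword token.
    private bool IsLetter(char c)
    {
        if (c == ' ') return true;
        if (c >= 'a' && c <= 'z') return true;
        if (c >= 'A' && c <= 'Z') return true;
        return false;
    }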
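
The debugger observation about Char.IsLetter() can be reproduced outside PDFsharp. The sketch below is a standalone console program with made-up names. It assumes the lexer sees raw byte values widened directly to char, which is a guess about how the characters arrive, not something confirmed from the PDFsharp source. Under that assumption, a byte value such as 250 becomes U+00FA, which .NET classifies as a letter even though it is not an ASCII letter.

```csharp
using System;

// Hypothetical standalone check, not part of PDFsharp.
public static class LetterCheck
{
    // ASCII-only letter test: the same ranges as the custom IsLetter,
    // but without the special case for the blank.
    public static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void Main()
    {
        // Sample byte values: two ASCII letters, the control code 18
        // from the first post, and two values from the upper Latin-1 range.
        int[] codes = { 65, 122, 18, 250, 255 };

        foreach (int code in codes)
        {
            char ch = (char)code;
            Console.WriteLine(
                "{0,3}: Char.IsLetter={1}, IsAsciiLetter={2}",
                code, Char.IsLetter(ch), IsAsciiLetter(ch));
        }
    }
}
```

Char.IsLetter() follows Unicode categories, so accented Latin-1 letters count as letters, while the ASCII range test rejects them.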


Then I changed ScanKeyword() in two spots.

On line 307, I replaced Char.IsLetter(ch) with the same custom IsLetter(ch) method.

Code:
      while (true)
      {
        if (IsLetter(ch))
        {
          this.token.Append(ch);


And on line 317, I changed the switch statement to handle errant whitespace characters:

Code:
      // Check known tokens
      switch (this.token.ToString().Trim())
      {



Like I say, this worked for me, but there is a very real chance (almost a certainty) that I screwed something up by doing this. I would love feedback from someone who is a little more conversant in the PDF format than I am to tell me what I may have messed up with these changes.
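
One side effect that may be worth checking is sketched below with a simplified stand-in scanner. This is not PDFsharp's Lexer: the names are invented and the loop only imitates the "append while IsLetter" pattern from the snippets above. It suggests that when the blank counts as a letter, two keywords separated by a single blank are collected as one token, and Trim() only removes the blanks at the ends.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for a keyword scanner, not PDFsharp's Lexer.
public static class KeywordScanSketch
{
    // Mirrors the custom IsLetter from this post: ASCII letters plus blank.
    public static bool IsLetterWithBlank(char c)
    {
        return c == ' ' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // ASCII letters only, blank excluded.
    public static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Collects characters from 'start' while the predicate accepts them,
    // then trims the result, as the modified switch statement does.
    public static string ScanKeyword(string input, int start, Func<char, bool> isLetter)
    {
        var token = new StringBuilder();
        int i = start;
        while (i < input.Length && isLetter(input[i]))
        {
            token.Append(input[i]);
            i++;
        }
        return token.ToString().Trim();
    }

    public static void Main()
    {
        // Trailing blank after a keyword: both predicates yield the keyword.
        Console.WriteLine("[" + ScanKeyword("stream \n", 0, IsLetterWithBlank) + "]");
        Console.WriteLine("[" + ScanKeyword("stream \n", 0, IsAsciiLetter) + "]");

        // Two keywords separated by one blank, as in a PDF array of booleans.
        Console.WriteLine("[" + ScanKeyword("true false]", 0, IsLetterWithBlank) + "]");
        Console.WriteLine("[" + ScanKeyword("true false]", 0, IsAsciiLetter) + "]");
    }
}
```

In this sketch the input "true false]" comes back as the single token "true false" when the blank is accepted, which would not match any keyword in a switch statement. Whether the real lexer ever reaches that situation depends on code not shown in this thread.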

Thanks...

RKM

