PDFsharp & MigraDoc Foundation
https://forum.pdfsharp.net/

Problem with trailing '\0'
https://forum.pdfsharp.net/viewtopic.php?f=2&t=2234

Author:  ToniM [ Wed Nov 21, 2012 8:18 am ]
Post subject:  Problem with trailing '\0'

Hello Guys,


I ran into a problem (caused by SAP): the PDF contains some '\0' bytes after the '%%EOF' marker.

The problem occurs in Parser.cs, in the method ReadTrailer().
There the trailer is searched for within a fixed range from the "real" end of the file.

I can think of two ways to avoid the 'Unexpected Token' error:
- extend the range from 130 to x (not the best way)
- count the '\0' bytes at the end and add that count to 130 (I don't really like it either, but it is the better way)
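
A rough sketch of the second option, working on a plain byte array rather than on the PDFsharp lexer. The names TrailerWindow, CountTrailingNulls and WindowStart are made up for illustration and are not PDFsharp API; 130 is the window size mentioned in this thread.

```csharp
using System;

static class TrailerWindow
{
    // Window size discussed in this thread.
    public const int BaseWindow = 130;

    // Counts the '\0' bytes at the very end of the file.
    public static int CountTrailingNulls(byte[] pdf)
    {
        int count = 0;
        for (int i = pdf.Length - 1; i >= 0 && pdf[i] == 0; i--)
            count++;
        return count;
    }

    // Start offset of the search window: the last 130 bytes before the
    // padding, clamped to 0 for very small files.
    public static int WindowStart(byte[] pdf)
    {
        int padding = CountTrailingNulls(pdf);
        return Math.Max(0, pdf.Length - padding - BaseWindow);
    }
}
```

With this, a file that ends in a long run of zero bytes would be searched in the 130 bytes just before the padding instead of inside the padding itself.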


What would you suggest?
And is there a reason for the hardcoded "130 bytes from the end" approach that I don't know about?
Why not search for 'startxref' within the whole document?

Thanks in advance!

BR ToniM

Author:  Thomas Hoevel [ Wed Nov 21, 2012 9:51 am ]
Post subject:  Re: Problem with trailing '\0'

Hi!

Code that searches the complete file:
http://forum.pdfsharp.net/viewtopic.php?p=583#p583

A better approach: search within 130 bytes at the end of the file, then search the whole file as a fallback strategy.
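
A minimal sketch of that two-stage strategy on a plain byte array. StartXRefFinder and its methods are hypothetical names for illustration only, not part of PDFsharp, and the sketch takes the last occurrence of the keyword in both stages, which is an assumption.

```csharp
using System;
using System.Text;

static class StartXRefFinder
{
    static readonly byte[] Keyword = Encoding.ASCII.GetBytes("startxref");

    // Returns the offset of the last "startxref" that begins at or after
    // 'from', or -1 if there is none.
    static int LastIndexOfKeyword(byte[] pdf, int from)
    {
        for (int i = pdf.Length - Keyword.Length; i >= from; i--)
        {
            int j = 0;
            while (j < Keyword.Length && pdf[i + j] == Keyword[j])
                j++;
            if (j == Keyword.Length)
                return i;
        }
        return -1;
    }

    // Fast path: look only in the last 'window' bytes.
    // Fallback: scan the whole file. Returns -1 if the keyword is missing.
    public static int Find(byte[] pdf, int window)
    {
        int tailStart = Math.Max(0, pdf.Length - window);
        int idx = LastIndexOfKeyword(pdf, tailStart);
        return idx >= 0 ? idx : LastIndexOfKeyword(pdf, 0);
    }
}
```

For a well-formed file only the short tail is scanned; the full scan runs only when the tail contains no 'startxref', for example because of trailing padding.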

The guys at SAP don't know what "%%EOF" stands for. They shouldn't write more bytes than the file actually needs. 50000 zero bytes appended to the end of the file make no sense.

Author:  ToniM [ Wed Nov 21, 2012 10:00 am ]
Post subject:  Re: Problem with trailing '\0'

Hi,

is there a reason for the 130 bytes?
Or is it just a guessed value with some safety margin?

Either way, I will follow your approach.
Are you interested in the result, so that you can check it in?

BR ToniM

Author:  Thomas Hoevel [ Wed Nov 21, 2012 10:16 am ]
Post subject:  Re: Problem with trailing '\0'

Initially we read only 30 bytes (enough if "%%EOF" is at the end of the file).
Then we found some PDF files where the producer added the product name after "%%EOF", so we now search 130 bytes.
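
To make the two numbers concrete, here is a small sketch that measures how far 'startxref' sits from the end of a file. TailLength is a hypothetical helper, the sample tails are invented, and the strings are assumed to be plain ASCII so that characters equal bytes.

```csharp
using System;

static class TailLength
{
    // Number of characters from the start of the last "startxref" to the
    // end of the text, or -1 if the keyword is missing.
    public static int FromStartXRefToEnd(string pdfTail)
    {
        int idx = pdfTail.LastIndexOf("startxref", StringComparison.Ordinal);
        return idx < 0 ? -1 : pdfTail.Length - idx;
    }
}
```

For the invented tail "startxref\n116\n%%EOF\n" this gives 20, which fits into 30 bytes. Appending a comment line such as "% Created by ExampleProducer 1.0\n" after "%%EOF" gives 53, which no longer fits into 30 bytes but still fits into 130.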

Please post your changes here for us and all. Thanks!

Author:  ToniM [ Wed Nov 21, 2012 10:17 am ]
Post subject:  Re: Problem with trailing '\0'

Sorry, I didn't search well enough!

Please close this thread!

This is the original thread:
http://forum.pdfsharp.net/viewtopic.php?p=583#p583


Thank you, and sorry for the trouble.

BR ToniM


PS: Here is my solution. Maybe you would like to include it in a future version, to avoid duplicate postings :)
(Version: 1.32.2608)
Code:
    /// <summary>
    /// Reads the iref table and the trailer dictionary.
    /// </summary>
    internal PdfTrailer ReadTrailer()
    {
      //Symbol symbol;
      //string token;
      //int xrefOffset = 0;
      int length = lexer.PdfLength;

      int offset = 0;
#if true
      offset = 130;
#else
      offset = 30;
#endif

      string trail = this.lexer.ReadRawString(length - offset - 1, offset); //lexer.Pdf.Substring(length - 30);
      int idx = trail.IndexOf("startxref");
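      if (idx < 0)
      {
        // Fallback: "startxref" is not within the last 130 bytes
        // (e.g. because of trailing '\0' bytes), so search the whole file.
        trail = this.lexer.ReadRawString(0, length);
        idx = trail.LastIndexOf("startxref");
        // Fail explicitly instead of setting the position to -1.
        if (idx < 0)
          throw new InvalidOperationException("'startxref' not found in PDF file.");
        this.lexer.Position = idx;
      }
      else
      {
        this.lexer.Position = length - offset - 1 + idx;
      }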

      ReadSymbol(Symbol.StartXRef);
      this.lexer.Position = ReadInteger();

      // Read all trailers
      PdfTrailer trailer;
      while (true)
      {
        trailer = ReadXRefTableAndTrailer(this.document.irefTable);
        // 1st trailer seems to be the best..
        if (this.document.trailer == null)
          this.document.trailer = trailer;
        int prev = trailer.Elements.GetInteger(PdfTrailer.Keys.Prev);
        if (prev == 0)
          break;
        //if (prev > this.lexer.PdfLength)
        //  break;
        this.lexer.Position = prev;
      }

      return this.document.trailer;
    }

All times are UTC