PDFsharp & MigraDoc Foundation

PDFsharp - A .NET library for processing PDF & MigraDoc Foundation - Creating documents on the fly
All times are UTC


PostPosted: Mon Aug 15, 2016 1:14 pm 

Joined: Tue Aug 02, 2016 9:56 am
Posts: 40
Location: Amsterdam, The Netherlands
What happens:

When parsing a floating point number with 10 or more digits after the decimal point, PdfSharp throws an index out of bounds exception.

Cause:

PdfSharp ignores digits after the 10th decimal. Whether that is a smart move or not is debatable. Anyway, in CLexer.ScanNumber(), there is an off-by-one error between the test "if (decimalDigits < 10)" and the length of the PowersOf10 array.

Fix:

Attached. Added an entry to the PowersOf10 array.
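
For readers without the attachment, here is a minimal sketch of this kind of off-by-one. It is not the actual PDFsharp source: the class name, the ScanFraction helper and the table layout are assumptions made for illustration. The point is only that a scan which accepts up to 10 decimal digits needs a table with 11 entries (10^0 through 10^10) if the digit count is used as the index.

```csharp
using System;

static class FractionScanSketch
{
    // Indexed by the number of decimal digits consumed (0..10), so 11 entries.
    // A table that stops at 1e9 has only 10 entries and is overrun
    // as soon as decimalDigits reaches 10.
    static readonly double[] PowersOf10 =
    {
        1, 10, 100, 1000, 10000, 100000, 1000000,
        10000000, 100000000, 1000000000, 10000000000
    };

    // Hypothetical helper: converts the digits that follow the decimal
    // point, ignoring everything past the 10th digit.
    public static double ScanFraction(string digits)
    {
        long value = 0;
        int decimalDigits = 0;
        foreach (char c in digits)
        {
            if (c < '0' || c > '9')
                throw new FormatException("Expected decimal digits only.");
            if (decimalDigits < 10)
            {
                value = value * 10 + (c - '0');
                decimalDigits++;
            }
        }
        // decimalDigits can be 10 here, hence the 11th table entry.
        return value / PowersOf10[decimalDigits];
    }
}
```

With only 10 table entries, the final line of the helper would index past the end of the array for any input of 10 or more digits.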

Suggestion:

Why isn't float.Parse() used, possibly after checking that the token only contains digits and a period? That code is much more likely to have been optimized and thoroughly tested for correctness. There are many pitfalls when parsing integers and floats, see for example http://stackoverflow.com/questions/8522 ... m-a-string .
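
A rough sketch of what that suggestion could look like, assuming the token has already been isolated by the lexer. The class and method names are hypothetical and this is not how PDFsharp does it. One pitfall worth noting: float.Parse() without an explicit culture uses the current culture, so on a system where the decimal separator is a comma it would misread "1.5"; the sketch therefore passes CultureInfo.InvariantCulture.

```csharp
using System;
using System.Globalization;

static class InvariantParseSketch
{
    // Hypothetical helper: accepts an optional leading sign, digits and
    // at most one period, then defers to float.TryParse with the
    // invariant culture so the result does not depend on OS settings.
    public static bool TryParseReal(string token, out float result)
    {
        result = 0f;
        if (string.IsNullOrEmpty(token))
            return false;

        int periods = 0, digits = 0;
        for (int i = 0; i < token.Length; i++)
        {
            char c = token[i];
            if (c >= '0' && c <= '9') digits++;
            else if (c == '.') periods++;
            else if ((c == '+' || c == '-') && i == 0) { /* leading sign is fine */ }
            else return false; // exponents, commas, spaces etc. are rejected
        }
        if (digits == 0 || periods > 1)
            return false;

        return float.TryParse(token,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture, out result);
    }
}
```

The pre-check keeps out forms that float.TryParse would otherwise be able to accept with broader NumberStyles, such as exponents or thousands separators.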


Attachments:
pdfsharp-685.zip [387 Bytes]

_________________
Gerben Vos
Developer


Last edited by Gerben Vos on Mon Aug 15, 2016 1:38 pm, edited 1 time in total.
PostPosted: Mon Aug 15, 2016 1:32 pm 

Joined: Tue Aug 02, 2016 9:56 am
Posts: 40
Location: Amsterdam, The Netherlands
Problem can be reproduced using the test program attached to viewtopic.php?f=3&t=3411 .

Unfortunately, we don't have a non-confidential PDF file that triggers this problem, but it should be fairly straightforward to create one.

_________________
Gerben Vos
Developer


PostPosted: Mon Aug 15, 2016 1:36 pm 
PDFsharp Guru

Joined: Mon Oct 16, 2006 8:16 am
Posts: 3095
Location: Cologne, Germany
Hi!
Gerben Vos wrote:
PdfSharp ignores digits after the 10th decimal. Whether that is a smart move or not is debatable.
Not really IMHO. According to Adobe reference material, Adobe Reader uses single precision internally (up to 7 decimal digits of precision).

From .NET documentation: "A Single value has up to 7 decimal digits of precision, although a maximum of 9 digits is maintained internally".
Not much gained by reading more than 10 digits.

Gerben Vos wrote:
Why isn't float.Parse() used, possibly after checking that the token only contains digits and a period?
I don't know. But this is a good question. I'll try to find out.
PDFsharp was developed with .NET 1.1 and float.TryParse() was still missing. float.Parse() would have been an option back then - and float.TryParse() is an option today.

_________________
Regards
Thomas Hoevel
PDFsharp Team


PostPosted: Mon Aug 15, 2016 1:40 pm 

Joined: Tue Aug 02, 2016 9:56 am
Posts: 40
Location: Amsterdam, The Netherlands
What about, for example, a number like 0.0000000000001234567 ? It has 7 digits of precision, but PdfSharp would treat it as zero. (I haven't checked what Adobe does in this case.)
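
To make the concern concrete, here is a small sketch comparing the two approaches on that number. The helper names are hypothetical, and TruncateToTenDecimals only imitates the "ignore digits after the 10th decimal" behaviour described above rather than reproducing PDFsharp's lexer.

```csharp
using System;
using System.Globalization;

static class SmallValueSketch
{
    // Imitates dropping every digit after the 10th decimal place
    // before converting the token to a number.
    public static double TruncateToTenDecimals(string token)
    {
        int dot = token.IndexOf('.');
        if (dot < 0 || token.Length - dot - 1 <= 10)
            return double.Parse(token, CultureInfo.InvariantCulture);
        // Keep the integer part, the period and the first 10 decimals.
        return double.Parse(token.Substring(0, dot + 11), CultureInfo.InvariantCulture);
    }

    // Lets the framework parse the whole token into a single-precision value.
    public static float ParseAsSingle(string token)
    {
        return float.Parse(token, CultureInfo.InvariantCulture);
    }
}
```

For the token 0.0000000000001234567, the truncating helper returns exactly zero because the first 10 decimals are all zeros, while ParseAsSingle returns a small positive value, since single precision limits the number of significant digits and not the number of decimal places.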

_________________
Gerben Vos
Developer

