PDFsharp & MigraDoc Foundation https://forum.pdfsharp.net/

Bug + patch: Array bounds exception while parsing float
https://forum.pdfsharp.net/viewtopic.php?f=3&t=3426

Author:	Gerben Vos [ Mon Aug 15, 2016 1:14 pm ]
Post subject:	Bug + patch: Array bounds exception while parsing float

What happens:

When parsing a floating point number with 10 or more digits after the decimal point, PdfSharp gives an index out of bounds exception.

Cause:

PdfSharp ignores digits after the 10th decimal. Whether that is a smart move or not is debatable. Anyway, in CLexer.ScanNumber(), there is an off-by-one error between the test "if (decimalDigits < 10)" and the length of the PowersOf10 array: the test lets decimalDigits reach 10, which is one past the last valid index of the array.
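
A minimal sketch of how an off-by-one of this kind can arise. This is not the actual PDFsharp source: the class name ScanNumberSketch, the method ParseFraction, and the contents of both tables are assumptions made for illustration. Only the test "if (decimalDigits < 10)" and the idea of a PowersOf10 lookup table come from the description above.

```csharp
using System;

public static class ScanNumberSketch
{
    // Hypothetical table with ten entries (10^0 .. 10^9); valid indices are 0..9.
    public static readonly double[] PowersOf10Short =
    {
        1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000
    };

    // The same table with one more entry (10^10), so index 10 is valid too.
    public static readonly double[] PowersOf10Fixed =
    {
        1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000,
        10000000000
    };

    // Converts the digits after the decimal point into a fraction.
    // The caller is assumed to pass ASCII digits only.
    public static double ParseFraction(string fractionDigits, double[] powersOf10)
    {
        double value = 0;
        int decimalDigits = 0;
        foreach (char c in fractionDigits)
        {
            // Accepts the 1st through the 10th digit, so decimalDigits can end up at 10.
            if (decimalDigits < 10)
            {
                value = value * 10 + (c - '0');
                decimalDigits++;
            }
            // Digits after the 10th are silently ignored.
        }
        // With a ten-entry table, index 10 is out of range.
        return value / powersOf10[decimalDigits];
    }
}
```

In this sketch, ScanNumberSketch.ParseFraction("1234567890", ScanNumberSketch.PowersOf10Short) throws an IndexOutOfRangeException, while the same call with PowersOf10Fixed returns a value close to 0.123456789. Changing the test to "decimalDigits < 9" would be the other way to make the two agree, at the cost of one digit.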

Fix:

Attached. Added an entry to the PowersOf10 array.

Suggestion:

Why isn't float.Parse() used, possibly after checking that the token only contains digits and a period? That code is much more likely to have been optimized and thoroughly tested for correctness. There are many pitfalls when parsing integers and floats, see for example http://stackoverflow.com/questions/8522 ... m-a-string .
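
To make the suggestion concrete, here is a rough sketch of what "check the token, then hand it to the framework" could look like. The class name RealTokenParser and the method TryParseReal are hypothetical and not part of PDFsharp. The sketch uses double.TryParse with CultureInfo.InvariantCulture so that the period is always the decimal separator regardless of the machine's locale, and it assumes the token has already been cut out of the content stream by the lexer.

```csharp
using System;
using System.Globalization;

public static class RealTokenParser
{
    // Returns true and the parsed value if the token consists of an optional
    // sign, ASCII digits, and at most one period; returns false otherwise.
    public static bool TryParseReal(string token, out double value)
    {
        value = 0;
        if (string.IsNullOrEmpty(token))
            return false;

        int start = (token[0] == '+' || token[0] == '-') ? 1 : 0;
        bool sawDigit = false;
        bool sawPeriod = false;

        for (int i = start; i < token.Length; i++)
        {
            char c = token[i];
            if (c >= '0' && c <= '9')
                sawDigit = true;
            else if (c == '.' && !sawPeriod)
                sawPeriod = true;
            else
                return false; // exponents, commas, whitespace etc. are rejected here
        }

        if (!sawDigit)
            return false;

        // Invariant culture: '.' is the decimal separator on every machine.
        return double.TryParse(
            token,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture,
            out value);
    }
}
```

With this sketch, a token such as "-12.5" parses to -12.5, while "1e5" and "1,5" are rejected by the character check before the framework parser is ever called.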

Attachments:

pdfsharp-685.zip [387 Bytes]

Author:	Gerben Vos [ Mon Aug 15, 2016 1:32 pm ]
Post subject:	Re: Bug + patch: Array bounds exception while parsing float
The problem can be reproduced using the test program attached to https://forum.pdfsharp.net/viewtopic.php?f=3&t=3411. Unfortunately, we don't have a non-confidential PDF file that triggers this problem, but it should be fairly straightforward to create one.

Author:	Thomas Hoevel [ Mon Aug 15, 2016 1:36 pm ]
Post subject:	Re: Bug + patch: Array bounds exception while parsing float
Hi!

Gerben Vos wrote:
"PdfSharp ignores digits after the 10th decimal. Whether that is a smart move or not is debatable."

Not really IMHO. According to Adobe Reference material, Adobe Reader uses single precision inside (up to 7 decimal digits of precision). From .NET documentation: "A Single value has up to 7 decimal digits of precision, although a maximum of 9 digits is maintained internally". Not much gained by reading more than 10 digits.

Gerben Vos wrote:
"Why isn't float.Parse() used, possibly after checking that the token only contains digits and a period?"

I don't know. But this is a good question. I'll try to find out. PDFsharp was developed with .NET 1.1 and float.TryParse() was still missing. float.Parse() would have been an option back then - and float.TryParse() is an option today.

Author:	Gerben Vos [ Mon Aug 15, 2016 1:40 pm ]
Post subject:	Re: Bug + patch: Array bounds exception while parsing float
What about, for example, a number like 0.0000000000001234567? It has 7 significant digits, but PdfSharp would treat it as zero because the first 10 digits after the decimal point are all zeros. (I haven't checked what Adobe does in this case.)
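
A small sketch to illustrate the point about very small numbers. It is not PDFsharp code: SmallNumberDemo, ParseTruncated, and ParseSingle are hypothetical names, and ParseTruncated only imitates the "ignore everything after the 10th decimal" behaviour described in this thread by cutting the string before parsing.

```csharp
using System;
using System.Globalization;

public static class SmallNumberDemo
{
    // Imitates a parser that keeps at most 10 digits after the period.
    public static double ParseTruncated(string token)
    {
        int dot = token.IndexOf('.');
        if (dot < 0)
            return double.Parse(token, CultureInfo.InvariantCulture);

        // Keep the integer part, the period, and up to 10 fractional digits.
        int keep = Math.Min(token.Length, dot + 1 + 10);
        return double.Parse(token.Substring(0, keep), CultureInfo.InvariantCulture);
    }

    // Parses the whole token as a single-precision value.
    public static float ParseSingle(string token)
    {
        return float.Parse(token, CultureInfo.InvariantCulture);
    }
}
```

For the token "0.0000000000001234567", ParseTruncated returns exactly 0, because only "0.0000000000" survives the cut. ParseSingle returns a non-zero value of roughly 1.234567E-13, which single precision can represent because its 7 or so digits of precision are significant digits, not digits after the decimal point.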

All times are UTC