Rethinking Numeric: Decimal floats, or Rounding?
jralls at ceridwen.us
Sat Jul 5 05:35:35 EDT 2014
> On July 4, 2014 at 4:59 PM Christian Stimming <christian at cstimming.de> wrote:
> Am Sonntag, 22. Juni 2014, 13:47:49 schrieb John Ralls:
> > If we’re going to fix that by rounding, we have to round to a power-of-ten
> > denominator, but we have to do it in the right places to avoid accumulating
> > errors; for a complex multi-currency transaction that might be to round
> > each exchange to the target’s SCU before performing the next exchange;
> > interest calculations might be a bit harder. In either case, that won’t
> > necessarily prevent a GCD overflow in the presence of a large, weird,
> > denominator.
> > So you proposed decimal floats as a way around that problem. I think it’s a
> > good alternative, and I pursued it. Why are you now having second thoughts?
> Hi John,
> I'm not at all against the decimal floating point types. The implementation
> looks nice.
> However, I probably still didn't get the original problem. It seemed to me
> that some general overflow situation was being detected in our 16-byte
> rational numbers - which is completely normal for any finite-precision
> number. But if such an overflow happens during gnucash, it just means that
> the usual financial rounding procedures must be applied. As long as we don't
> round with less precision than what normal financial procedures use, we are
> fine. With the exception that sometimes we are even fine only if we do
> exactly the necessary rounding and don't keep more precision than what the
> currency or security in question should hold. We've gone through that
> discussion at length so far. This question isn't related to the data type but
> rather asks whether our rounding strategy in general needs some changes
> towards accepted financial rounding rules. From my understanding,
> calculations with normal financial rounding rules can be achieved with any of
> the decimal float types in question, but probably just as well with our
> existing 16-byte rational numbers. In other words, from my understanding,
> switching from 16-byte rationals to 8-byte decimal floats does not make the
> original problem any worse, but also not really any better. What did I miss?
I think what you're missing is that an overflow is a show-stopper, and it
happens inside the calculation before any rounding is applied.
You might also be missing that although we use two 8-byte integers to store
numerical values and four for computation, we have to clamp actual values to 5
1/2 bytes (for each) in the random tests to avoid overflows. There isn't any
such clamp in real code except for the max denominator for SCU, but that doesn't
apply to exchange rates computed from actual transactions (as opposed to ones
added via the price editor or F::Q).
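To make the failure mode concrete, here's a sketch in Python (using
fractions.Fraction as a stand-in for gnc_numeric, with an explicit check
against the two signed 64-bit fields). The rates and the would_overflow /
round_to_scu helpers are hypothetical and contrived - the denominators are
large primes chosen so that reduction can't rescue the result - but the shape
of the problem is the same as chaining exchanges whose rates were computed
from actual transactions:

```python
from fractions import Fraction

INT64_MAX = 2**63 - 1

def would_overflow(q: Fraction) -> bool:
    """True if the reduced rational no longer fits two signed 64-bit
    fields (numerator and denominator), as gnc_numeric stores them."""
    return abs(q.numerator) > INT64_MAX or q.denominator > INT64_MAX

# Contrived rates whose denominators are the primes 2^61-1 and 2^31-1,
# standing in for exchange rates derived from real transaction splits,
# whose denominators can be arbitrary.
rate_a = Fraction(2305843009213693951 + 12345, 2305843009213693951)
rate_b = Fraction(3 * 2147483647 + 1, 2147483647)
amount = Fraction(100000, 100)   # 1000.00 in a 1/100-SCU currency

# Chaining both exchanges exactly multiplies the denominators: the
# reduced denominator is the product of both primes, far past 2^63.
unrounded = amount * rate_a * rate_b
print(would_overflow(unrounded))   # True

def round_to_scu(q: Fraction, scu: int) -> Fraction:
    """Round to the commodity's SCU, i.e. a power-of-ten denominator."""
    return Fraction(round(q * scu), scu)

# Rounding each exchange to the target's SCU before the next exchange
# keeps the denominator small, at the cost of an inexact result.
step1 = round_to_scu(amount * rate_a, 100)
step2 = round_to_scu(step1 * rate_b, 100)
print(would_overflow(step2))       # False
```

The point is that the overflow happens in the middle of the exact
computation; by the time any rounding mode would normally be consulted, the
intermediate value already doesn't fit.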
There's also the problem that rounding is hard with rationals. What do we round
to, and when? Do we use a back-tracking computation that, instead of signaling
an error on overflow, somehow rounds the input and tries again, then signals an
inexact result? That would incur a performance hit, but if overflows in real
life are truly rare even when buying bitcoin with rupiah, then we could live
with that. We'd have to add an out-of-band error field to the existing
gnc-numeric to support it, and then look at every instance where we're using
GNC_HOW_RND_NEVER (10 instances in live code) and figure out how to handle the
rounding; since there's no way to have exact values with floats (2/3 must round
at some point), we have to do that anyway.
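A back-tracking computation of that sort might look something like the
following sketch. It is not the actual patch: fits_int64 and mul_with_retry
are hypothetical names, and the boolean return value stands in for the
out-of-band error field mentioned above:

```python
from fractions import Fraction

INT64_MAX = 2**63 - 1

def fits_int64(q: Fraction) -> bool:
    """Does the reduced rational fit two signed 64-bit fields?"""
    return abs(q.numerator) <= INT64_MAX and q.denominator <= INT64_MAX

def mul_with_retry(a: Fraction, b: Fraction, scu: int = 100):
    """Try the exact product first; on overflow, round both operands
    to the SCU, retry, and flag the result as inexact."""
    exact = a * b
    if fits_int64(exact):
        return exact, False          # exact result, no error flag
    a_r = Fraction(round(a * scu), scu)
    b_r = Fraction(round(b * scu), scu)
    return a_r * b_r, True           # rounded result, inexact flag set

# Worst case: operand denominators are large primes, so the exact
# product's denominator (their product) can't reduce below 2^63.
p1, p2 = 2**61 - 1, 2**31 - 1
result, inexact = mul_with_retry(Fraction(p1 + 1, p1), Fraction(2 * p2 + 1, p2))
print(result, inexact)               # 2 True
```

Every caller would then have to decide what to do when the inexact flag comes
back set, which is exactly the audit of the GNC_HOW_RND_NEVER call sites
described above.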
So what does mpdecimal buy us when it comes to rounding? A couple of things:
First, it's multiprecision, so it doesn't overflow during calculations. If it
needs more digits, it allocates memory for them on the heap and keeps going.
Second, its rounding is always to a power-of-ten denominator, which is a natural
fit with real-life finance. The penalty is that it (like any float) requires a
two-step fuzzy comparison to get some of the higher-digit-count random
computations to pass. One example:
Compared 1315499882.19624045049582323477529 to 1315499882.1962404504958232347753
got -1 reduced with 34 digits.
FAILURE expected 1315499882.196240450495823234775294 got
1315499882.196240450495823234775295 = 2790062078.910323445814906907732326 -
1474562196.714082995319083672957031 for exact subtraction
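CPython's decimal module is built on the same libmpdec, so it can reproduce
this failure mode: two results that differ only in the last of 34 significant
digits. One possible shape for the two-step comparison - exact first, then
within a unit in the last place - is sketched below; fuzzy_equal is a
hypothetical name, not the function in the patch:

```python
from decimal import Decimal, getcontext

getcontext().prec = 34   # decimal128, as in the test log above

a = Decimal("1315499882.196240450495823234775294")
b = Decimal("1315499882.196240450495823234775295")

def fuzzy_equal(x: Decimal, y: Decimal, ulps: int = 1) -> bool:
    """Two-step comparison: exact equality first, then treat x and y
    as equal if they differ by at most `ulps` units in the last place
    of the larger operand at the current precision."""
    if x == y:
        return True
    prec = getcontext().prec
    # Exponent of one unit in the last place of the larger operand.
    exp = max(x, y, key=abs).adjusted() - (prec - 1)
    return abs(x - y) <= Decimal(ulps).scaleb(exp)

print(a == b)             # False: the values differ in the 34th digit
print(fuzzy_equal(a, b))  # True: they agree to within one ulp
```

Note that simply re-rounding both values to one fewer digit isn't enough: with
half-even rounding, ...775294 and ...775295 round to 33 digits as ...77529 and
...77530 respectively, so a tolerance-based second step is needed.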
Don't freak out over the 34 digits. I picked 128-bit format to start with
because 64-bit BCD gives only 15 significant digits, and 128-bit was an easy
option. I'll tune it later for a smaller footprint. I suppose I should point out
that even with tuning, a decimal float-based gnc_numeric is going to be around
40 bytes in size compared to 20-24 for a rational-int based gnc_numeric with an
added flags field for signaling rounding, and a GncNumeric around 80, though I
might be able to shrink that a bit later.
I just pushed the "fuzzy rounding" changes for your inspection. With that,
test-numeric passes, though some of the other tests crash because there's a
deallocation error that I'm still working on.