Rethinking Numeric and rounding
Christian Stimming
christian at cstimming.de
Sun Jun 1 16:07:30 EDT 2014
On Saturday, 31 May 2014, 15:29:12, John Ralls wrote:
> >>> Is this http://en.wikipedia.org/wiki/Decimal_floating_point what you're
> >>> talking about?
> >
> > But back to your initial question: You said we occasionally "encounter
> > overflow errors". I don't understand (yet) what the actual problem is.
> > With
> > our current rational numbers and int64_t numerator we have approx. 19
> > decimal digits of precision (see [2] for the digits of a 64 bit signed
> > integer), if I consider the numerator as fully used.
>
> The core of the problem is that 2^63 (remember the sign bit!) is 9.2E18 and
> some of that range is consumed in the fractional value so with our current
> 6-digit max fraction for securities the largest number of units
> representable is 9E12, a pretty big number. There have been numerous
> requests over the years, principally from the Bitcoin community, that we
> increase the max fraction to 8 or 10. That would reduce the max number to
> 9E10 or 9E8, which is starting to get uncomfortable in some currencies;
Yes, you're right about the numbers: an int64 can represent numbers with 19
significant digits (or maybe 18), which is what I wrote above. As we have
decided to do the rounding with 6 digits after the decimal point, this leaves
us about 12 digits in front of the decimal point, which is sufficient for our
application domain. Reducing that to 10 or 8 digits would no longer be
sufficient, I agree on that.
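To make the digit budget concrete, here is a small sketch of the arithmetic. It assumes the numerator is a signed 64-bit integer and the denominator is a fixed power of ten; the function name max_whole_units is made up for this illustration and is not a gnucash function.

```python
INT64_MAX = 2**63 - 1  # largest signed 64-bit numerator, about 9.2E18


def max_whole_units(fraction_digits: int) -> int:
    """Largest whole-unit amount when the denominator is 10**fraction_digits."""
    return INT64_MAX // 10**fraction_digits


for digits in (6, 8, 10):
    # Prints the order of magnitude left in front of the decimal point.
    print(digits, f"{max_whole_units(digits):.1e}")
```

With 6 fraction digits this gives roughly 9E12, and with 8 or 10 digits roughly 9E10 or 9E8, matching the numbers quoted above.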
However, this always refers to a rounding mode with a fixed number of digits
after the decimal point. On the other hand, floating-point arithmetic by
definition has the decimal point moving, hence the number of significant
digits can be moved from before to after the decimal point and vice versa. We
don't have this flexibility here so far. But maybe even that isn't a problem.
> I don’t see rounding as an issue with rational numbers. It’s used only for
> display, and the internal representation is always exact. This sometimes
> trips me up when I’m doing taxes because my balance sheet will have a $.01
> discrepancy between Assets and Liabilities + Equity because I have to take
> the representations and transfer them to my tax software which then adds
> the rounded values instead of the exact ones. That would change in the face
> of a decimal internal representation, whether it’s binary or BCD. We’d need
> to allow for spare digits so that the rounding could be made insignificant
> at the maximum fraction level, hence my suggestion for a 12-decimal
> fixed-point on two uint64s.
From my point of view, the problem is not yet completely understood here.
IMHO the problem is not too much rounding -- the problem instead is too little
rounding. What are the exact requirements on the handling of rounding and
overflow from our application domain? My argument is that our application
domain of *managing finances* will give us the requirement to do the rounding
according to normal currency numbers! Instead, our implementation using
rational numbers tries to be "more precise" than rounding to currency numbers.
In effect, this is not more precise but rather more wrong. We must not pretend
to be more clever than the calculations that happen in reality. In financial
calculations, we must not round less than what IFRS or similar authorities
clearly specify. Hence, your
statement that our "internal representation is always exact" is correct from
our programmer's point of view, but IMHO it is not true from the application
requirement point of view (due to missing financial rounding). Your $0.01
discrepancy is a symptom of exactly that. This doesn't mean our calculation is
"more correct", it unfortunately just means our calculation is wrong.
Here's an example where the missing rounding will lead to wrong results: Let's
do some currency exchange, say between USD and EUR. Let's assume an exchange
rate of 1.50 USD = 1.00 EUR (given with normal currency SCU [1] i.e. 2 digits
after the decimal point here) or almost equivalently an exchange rate of
0.6667 EUR/USD (given with more digits after the point). Let's say the user
wants to enter some transactions where she sold 1 USD with the given rate
0.6667. She enters those numbers in the txn dialog: 1.00 USD, rate 0.6667, and
gnucash will calculate the resulting third number: 0.6667 EUR, but the display
will show 0.67 EUR due to the currency's SCU. The user presses Ok because the
resulting 0.67 EUR will match the number on the receipt, i.e. the 0.67 EUR
were the amount that resulted in reality. However, if the gnucash account
contains "the exact value" 0.6667 [2] and we don't do the correct rounding of
this exchange transaction to the currency's SCU, the gnucash account shows
0.67 EUR but internally contains a little bit less. Now the user enters a
second identical transaction: 1.00 USD, rate 0.6667, the displayed EUR value
is 0.67 which is the amount from reality. The user presses Ok again and she
has 2 * 0.67 EUR in reality = 1.34 EUR. However, in gnucash, the account
contains 2 * 0.6667 EUR = 1.3334 EUR, rounded for display to 1.33 EUR. Huh,
where did that 0.01 EUR go missing?
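The example can be replayed in a few lines. This is only a sketch of the arithmetic, not of what gnucash does internally (see [2]): Fraction stands in for the exact rational amount, and the half-up rounding mode and the helper name display are assumptions made for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP
from fractions import Fraction

SCU = Decimal("0.01")  # assumed smallest unit for EUR


def display(amount: Fraction) -> Decimal:
    """Round an exact rational amount to the SCU (half-up is an assumption)."""
    exact = Decimal(amount.numerator) / Decimal(amount.denominator)
    return exact.quantize(SCU, rounding=ROUND_HALF_UP)


rate = Fraction(6667, 10000)   # 0.6667 EUR/USD
one_txn = Fraction(1) * rate   # exact result of selling 1.00 USD

stored_exact = one_txn + one_txn                 # account keeps 1.3334 EUR
receipts = display(one_txn) + display(one_txn)   # reality: 0.67 + 0.67

print(display(one_txn))       # 0.67, what the user sees per transaction
print(display(stored_exact))  # 1.33, what the account balance shows
print(receipts)               # 1.34, what the receipts add up to
```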
My conclusion: We need to introduce more intentional rounding. Whenever the
result of a calculation represents a monetary amount that has some known
SCU (either from the currency or from the account), this amount needs to be
rounded to exactly those digits. No more, no less. And not only for display,
but for the real amounts.
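A minimal sketch of such intentional rounding, applied to the stored amount and not only to the display. The helper round_to_scu and its tie-breaking rule (half away from zero) are assumptions for this illustration; the rounding mode actually required would have to come from the application domain.

```python
import math
from fractions import Fraction


def round_to_scu(amount: Fraction, scu_fraction: int) -> Fraction:
    """Round to the nearest 1/scu_fraction, ties away from zero (assumed)."""
    scaled = amount * scu_fraction
    # floor(|x| + 1/2) rounds the magnitude half-up; the sign is restored below.
    units = math.floor(abs(scaled) + Fraction(1, 2))
    if scaled < 0:
        units = -units
    return Fraction(units, scu_fraction)


rate = Fraction(6667, 10000)  # 0.6667 EUR/USD
# Round each computed EUR amount to 1/100 before it is stored.
txn_eur = round_to_scu(Fraction(1) * rate, 100)
balance = txn_eur + txn_eur

print(txn_eur)   # 67/100
print(balance)   # 67/50, i.e. 1.34 EUR
```

With the rounding done per transaction, the stored balance agrees with the 1.34 EUR from the receipts.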
The overflow of the int64 rational numbers is not a problem but inevitable and
probably completely fine. Any numerical data type must have some restriction
on its precision sooner or later, as long as our computers have to live with
finite amounts of memory. Every division and multiplication can increase the
number of significant digits of the result. This just says that in computer
arithmetic there will always be the question of when to do the rounding. We
can't get around this. So we had better do the rounding according to the
application domain's requirements, which in our case means the rounding
happens rather soon. No need to get more significant digits from somewhere.
The number data type could have been chosen to suit our application domain
better (which is what I pointed out with the decimal64 floating-point format),
but the resulting 19 significant digits of gnc_numeric are just fine.
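As a rough illustration of how quickly unrounded results use up 64 bits, the sketch below multiplies repeatedly by the 0.6667 rate without any rounding and counts the steps until numerator or denominator no longer fits into an int64. Python's Fraction is used as a stand-in for an exact rational type; the function name is made up for the illustration.

```python
from fractions import Fraction

INT64_MAX = 2**63 - 1


def multiplications_until_overflow(start: Fraction, factor: Fraction) -> int:
    """Count unrounded multiplications until a 64-bit rational would overflow."""
    value = start
    count = 0
    while max(abs(value.numerator), value.denominator) <= INT64_MAX:
        value *= factor
        count += 1
    return count


rate = Fraction(6667, 10000)
# Each multiplication adds four decimal digits to the denominator,
# so 10**20 is reached on the fifth step.
print(multiplications_until_overflow(Fraction(1), rate))  # 5
```

If each intermediate result were rounded to the SCU instead, the denominator would stay at 100 and the question of overflow would only depend on the size of the amounts themselves.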
Did I get something completely wrong here? If we are missing the intentional
rounding of monetary values, this will be a problem, but the finite precision
of gnc_numeric will probably not be a problem.
Best Regards,
Christian
[1] SCU = Smallest Convertible Unit, our gnucash-internal abbreviation for the
smallest traded unit of a commodity. E.g., for USD this is 0.01 USD.
[2] To be honest, for this example I haven't checked whether gnucash really
doesn't round a calculated "To"-amount to the target currency's SCU in this
use case. If we had the rounding here, fortunately the described error would
not exist. But nevertheless I'm telling the example to describe how a missing
rounding of currency amounts will lead to wrong resulting amounts.