Rethinking Numeric and rounding

Sun Jun 1 23:14:54 EDT 2014

On Jun 1, 2014, at 1:07 PM, Christian Stimming wrote:

> Am Samstag, 31. Mai 2014, 15:29:12 schrieb John Ralls:
>>>>> Is this http://en.wikipedia.org/wiki/Decimal_floating_point what you're
>>>>> talking about? 
>>> 
>>> But back to your initial question: You said we occasionally "encounter
>>> overflow errors". I don't understand (yet) what the actual problem is.
>>> With
>>> our current rational numbers and int64_t numerator we have approx. 19
>>> decimal digits of precision (see [2] for the digits of a 64 bit signed
>>> integer), if I consider the numerator as fully used.
>> 
>> The core of the problem is that 2^63 (remember the sign bit!) is 9.2E18 and
>> some of that range is consumed in the fractional value so with our current
>> 6-digit max fraction for securities the largest number of units
>> representable is 9E12, a pretty big number. There have been numerous
>> requests over the years, principally from the Bitcoin community, that we
>> increase the max fraction to 8 or 10. That would reduce the max number to
>> 9E10 or 9E8, which is starting to get uncomfortable in some currencies;
> 
> Yes, you're right with the numbers: An int64 can represent numbers with 19 
> significant digits, that's what I wrote above (or maybe 18). As we have 
> decided to do the rounding with 6 digits after the decimal point, this leaves 
> us 12 digits in front of the decimal point, which is sufficient for our 
> application domain. Decreasing this to 10 or 8 is not sufficient anymore, I 
> agree on that. 
> 
> However, this always refers to a rounding mode with a fixed number of digits 
> after the decimal point. On the other hand, floating-point arithmetic by 
> definition has the decimal point moving, hence the number of significant 
> digits can be moved from before to after the decimal point and vice versa. We 
> don't have this flexibility here so far. But maybe even that isn't a problem.
> 
>> I don’t see rounding as an issue with rational numbers. It’s used only for
>> display, and the internal representation is always exact. This sometimes
>> trips me up when I’m doing taxes because my balance sheet will have a $.01
>> discrepancy between Assets and Liabilities + Equity because I have to take
>> the representations and transfer them to my tax software which then adds
>> the rounded values instead of the exact ones. That would change in the face
>> of a decimal internal representation, whether it’s binary or BCD. We’d need
>> to allow for spare digits so that the rounding could be made insignificant
>> at the maximum fraction level, hence my suggestion for a 12-decimal
>> fixed-point on two uint64s.
> 
> From my point of view the problem is still not yet understood completely here. 
> IMHO the problem is not too much rounding -- the problem instead is too little 
> rounding. What are the exact requirements on the handling of rounding and 
> overflow from our application domain? My argument is that our application 
> domain of *managing finances* will give us the requirement to do the rounding 
> according to normal currency numbers! Instead, our implementation using 
> rational numbers tries to be "more precise" than rounding to currency numbers. 
> 
> In effect, this is not more precise but rather more wrong. We must not pretend 
> to be more clever than the calculations that happen in reality. Speaking of 
> finance calculations, we must not do the calculations with lesser rounding 
> than what IFRS or similar authorities will clearly specify. Hence, your 
> statement that our "internal representation is always exact" is correct from 
> our programmer's point of view, but IMHO it is not true from the application 
> requirement point of view (due to missing financial rounding). Your $0.01 
> discrepancy is a symptom of exactly that. This doesn't mean our calculation is 
> "more correct", it unfortunately just means our calculation is wrong.
> 
> Here's an example where the missing rounding will lead to wrong results: Let's 
> do some currency exchange, say between USD and EUR. Let's assume an exchange 
> rate of 1.50 USD = 1.00 EUR (given with normal currency SCU [1] i.e. 2 digits 
> after the decimal point here) or almost equivalently an exchange rate of 
> 0.6667 EUR/USD (given with more digits after the point). Let's say the user 
> wants to enter some transactions where she sold 1 USD with the given rate 
> 0.6667. She enters those numbers in the txn dialog: 1.00 USD, rate 0.6667, and 
> gnucash will calculate the resulting third number: 0.6667 EUR, but the display 
> will show 0.67 EUR due to the currency's SCU. The user presses Ok because the 
> resulting 0.67 EUR will match the number on the receipt, i.e. the 0.67 EUR 
> were the amount that resulted in reality. However, if the gnucash account 
> contains "the exact value" 0.6667 [2] and we don't do the correct rounding of 
> this exchange transaction to the currency's SCU, the gnucash account shows 
> 0.67 EUR but internally contains a little bit less. Now the user enters a 
> second identical transaction: 1.00 USD, rate 0.6667, the displayed EUR value 
> is 0.67 which is the amount from reality. The user presses Ok again and she 
> has 2 * 0.67 EUR in reality = 1.34 EUR. However, in gnucash, the account 
> contains 2 * 0.6667 EUR = 1.3334 EUR, rounded for display to 1.33 EUR. Huh, 
> where did that 0.01 EUR go missing?
> 
> My conclusion: We need to introduce more intentional rounding. Every time when 
> the result of a calculation represents a monetary amount that has some known 
> SCU (either from the currency or from the account), this amount needs to be 
> rounded to exactly those digits. No more, no less. And not only for display, 
> but for the real amounts.
> 
> The overflow of the int64 rational numbers is not a problem but inevitable and 
> probably completely fine. Any numerical data type must have some restriction 
> on its precision sooner or later, as long as our computers have to live with 
> finite amounts of memory. Every division and multiplication will increase the 
> significant digits of the resulting number. This just says that in computer 
> arithmetic there will always be the question of when to do the rounding. We 
> can't get around this. So we should better do the rounding according to the 
> application domain's requirements, which in our case means the rounding 
> happens rather soon. No need to get more significant digits from somewhere. 
> The number data type could have been chosen more suitable to our application 
> domain (which is what I pointed out by the decimal64 floating point format), 
> but the resulting 19 significant digits of gnc_numeric are just fine.
> 
> Did I get something completely wrong here? If we are missing the intentional 
> rounding of monetary values, this will be a problem, but the finite precision 
> of gnc_numeric will probably not be a problem.
> 
> Best Regards,
> 
> Christian
> 
> 
> 
> [1] SCU = Smallest Convertible Unit, our gnucash-internal abbreviation for the 
> smallest traded unit of a commodity. E.g., for USD this is 0.01 USD.
> 
> [2] To be honest, for this example I haven't checked whether gnucash really 
> doesn't round a calculated "To"-amount to the target currency's SCU in this 
> use case. If we had the rounding here, fortunately the described error would  
> not exist. But nevertheless I'm telling the example to describe how a missing 
> rounding of currency amounts will lead to wrong resulting amounts.

Well, mostly. 

Overflow *is* a problem because it's an unrecoverable failure in the calculation. At the point an overflow happens, everything stops, the user is notified of the error, and the results of the calculation are wrong.
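
To make the range figures quoted above concrete, here is a minimal sketch in Python. It is a simplified model of a rational number with a signed 64-bit numerator and a power-of-ten denominator, not GnuCash's actual gnc_numeric code; the names `max_whole_units` and `checked_mul` are hypothetical and exist only for this illustration.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold


def max_whole_units(fraction_digits):
    """Largest whole number of units representable when `fraction_digits`
    decimal places share the int64 numerator (denominator 10**fraction_digits)."""
    return INT64_MAX // 10**fraction_digits


def checked_mul(num_a, den_a, num_b, den_b):
    """Multiply two rationals without reducing them; raise OverflowError if
    either part of the product would not fit in a signed 64-bit integer.
    (Assumption: a real implementation might reduce or round first.)"""
    num, den = num_a * num_b, den_a * den_b
    if abs(num) > INT64_MAX or den > INT64_MAX:
        raise OverflowError("product does not fit in int64")
    return num, den


for digits in (6, 8, 10):
    # 6 -> 9223372036854 (about 9E12)
    # 8 -> 92233720368   (about 9E10)
    # 10 -> 922337203    (about 9E8)
    print(digits, max_whole_units(digits))
```

In this model, multiplying two quantities of 5 million units each, both stored with 6 fractional digits, already overflows: `checked_mul(5_000_000_000_000, 10**6, 5_000_000_000_000, 10**6)` raises `OverflowError`, because the unreduced numerator is 2.5E25.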

The internal representation of the rate from a $1.50<->€1.00 exchange is 2/3, not 0.6667. The rounding has to occur after the calculation, so that you wind up with €X.67 regardless of the magnitude of X. That becomes more apparent in more complicated calculations, where for perfect accuracy intermediate values shouldn't be converted to decimal (and therefore rounded); only the final value should be.
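
A short Python sketch of the difference, using the standard `fractions` module as a stand-in for an exact rational type. The helper `convert` and its `scu_digits` parameter are hypothetical, and the half-to-even behaviour of `round()` on a `Fraction` is only an assumption about which rounding rule would be wanted.

```python
from fractions import Fraction


def convert(amount, rate, scu_digits=2):
    """Multiply exactly, then round once to the target currency's SCU."""
    return round(amount * rate, scu_digits)


exact_rate = Fraction(2, 3)           # 1.50 USD = 1.00 EUR, kept exact
decimal_rate = Fraction("0.6667")     # the same rate, pre-rounded to 4 places

usd = Fraction(1_500_001)

from_exact = convert(usd, exact_rate)      # 1000000.67 EUR
from_decimal = convert(usd, decimal_rate)  # 1000050.67 EUR

print(float(from_exact), float(from_decimal))
```

For 1 USD both rates give 0.67 EUR, so the pre-rounded rate looks harmless, but at 1,500,001 USD it is off by 50 EUR while the exact 2/3 rate still lands on the expected .67.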

That's for perfect accuracy. If accounting standards exist, either GAAP for the US or IFRS everywhere else, that specify how to do rounding in complicated calculations, then we should of course implement them if we can find and get access to the standards. ISTM that that rounding policy will wind up at a higher level than the base arithmetic class, but I'll reserve judgement until one of us can find such a standard and we can review it.

I haven't exhaustively reviewed the code, but my understanding from what I've seen is that we don't routinely round anything except currency amounts for display. That can produce imbalances when the display amounts are transferred out of GnuCash. In the absence of a standard I don't know whether GnuCash is wrong or whether the process of transferring report results to an external program is inherently flawed.
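
The imbalance can be reproduced with the two 0.6667 EUR splits from Christian's example. This is only an illustrative Python sketch; the variable names are made up here, and it assumes display rounding to 2 digits with Python's default half-to-even rule.

```python
from fractions import Fraction

# Two identical splits stored unrounded, as in the quoted example.
splits = [Fraction("0.6667"), Fraction("0.6667")]

# What an exact ledger shows: sum first, round once for display.
displayed_total = round(sum(splits), 2)               # 1.33

# What an external program sees: each displayed value, then summed.
total_of_displayed = sum(round(s, 2) for s in splits)  # 1.34

discrepancy = total_of_displayed - displayed_total     # 0.01
print(float(displayed_total), float(total_of_displayed), float(discrepancy))
```

Either total is defensible on its own terms; the 0.01 appears only because the two sides round at different points.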

Regards,
John Ralls