Rethinking Numeric

Sat May 31 18:29:12 EDT 2014

On May 31, 2014, at 2:19 PM, Christian Stimming <christian at cstimming.de> wrote:

> On Sunday, 25 May 2014, 07:34:14, John Ralls wrote:
>>>> If we've reached the point where our int64 rational numbers do not fit
>>>> our problem requirements anymore, I'd rather look for a different number
>>>> representation that fits our application domain better. I'm thinking
>>>> about replacing rational numbers by decimal floating point numbers. That
>>>> is, a number is represented by m * 10^e with the mantissa m and exponent
>>>> e as signed integers. This is different from our normal *binary*
>>>> floating point in that we use the exponent with base 10. However, all
>>>> common rules for floating point can be applied just as normal. By the
>>>> way, maybe there is even a standard comparable to IEEE 754 available?
>>>> 
>>>> Just another possible way to proceed for solving this problem...
>>> 
>>> Is this http://en.wikipedia.org/wiki/Decimal_floating_point what you're
>>> talking about? If so, it says that the IEEE spec is 854, and answers some
>>> of my questions, but leaves out database support.
>> 
>> A different spec: http://speleotrove.com/decimal/decarith.html
>> And an implementation, which is used in CPython:
>> http://www.bytereef.org/mpdecimal/index.html
> 
> Thanks for the pointers: Yes, that was exactly what I was talking about. Turns 
> out the 2008 version of IEEE 754 [1] now also has this included (as 
> "decimal64" etc), but the implementations in everyday compilers and/or 
> hardware come along rather slowly. There are well-established library 
> implementations available, though, such as the one on speleotrove you 
> mentioned, called "decNumber", but others just as well. If we want to, we can 
> very well include a library such as that one into gnucash and start using 
> that.
> 
> But back to your initial question: You said we occasionally "encounter 
> overflow errors". I don't understand (yet) what the actual problem is. With 
> our current rational numbers and int64_t numerator we have approx. 19 decimal 
> digits of precision (see [2] for the digits of a 64 bit signed integer), if I 
> consider the numerator as fully used. 
> 
> Are 19 significant decimal digits not enough? Are there conceivable cases
> where they are not enough? I tend to think the problem is rather found in our
> rational number's rounding, which is not the suitable rounding method for our
> financial application domain. If this is the problem, we would need a
> different data type that always rounds according to decimal numbers, and not
> according to (in normal float/double calculations) binary floating point
> numbers, or (in gnc_numeric) according to rational numbers with some
> potentially unknown denominator.
> 
> If this is indeed the problem, switching to a data type with strict decimal 
> number behaviour might be the solution. And the IEEE 754-2008 decimal64 type 
> might be one of the possible implementations, available in one of the 
> mentioned libraries. For the record, decimal64 has 16 digits precision [1], 
> i.e. it won't give us more digits in its 8 bytes compared to our 16 bytes so 
> far. Maybe we want decimal128, which has 34 digits precision [1]. My gut 
> feeling says the digits are not the problem and 16 digits are sufficient, but 
> the rounding behaviour is indeed the problem.
> 
> As for database implementations: The speleotrove site [3] says something about
> some databases that directly have a DECFLOAT type (such as ABAP) but
> apparently this is not the case for the databases we're looking at. Hence the
> storage would have to be done manually, maybe in two integers (significand and
> exponent), or in a string, but both would require further calculations before
> they can be used in a query directly.
> 
> Maybe not yet an easy solution available? But what again was the core of the 
> problem?

The core of the problem is that 2^63 (remember the sign bit!) is 9.2E18, and some of that range is consumed by the fractional value, so with our current 6-digit max fraction for securities the largest number of units representable is 9E12, a pretty big number. There have been numerous requests over the years, principally from the Bitcoin community, that we increase the max fraction to 8 or 10. That would reduce the max number to 9E10 or 9E8, which is starting to get uncomfortable in some currencies; the worst case in the Wall Street Journal’s currency table is the Indonesian Rupiah at 11675/USD, followed by the Colombian Peso at 1897. I just checked a quote for Bitcoin in Rupiah: 7318412.03498 for 1 BTC. I can all too easily imagine a calculation involving that large an exchange rate overflowing an int64 if the denominators work out the wrong way. Yes, the denominators of both BTC and IDR are powers of 10, but the most likely calculation would be calculating the price of x BTC converted to or from y IDR, and there’s no clamp on the fraction in that calculation.
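
To put rough numbers on that, here is a minimal sketch in Python (not GnuCash's C code, and not how gnc_numeric actually multiplies). It only checks magnitudes: the largest whole-unit amount for a given max fraction, and whether a naive, un-reduced product of two numerators would still fit in an int64. The 1000 BTC amount and the 8-digit BTC fraction are assumptions chosen for illustration, and max_whole_units and fits_int64 are hypothetical helper names.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit numerator can hold


def max_whole_units(fraction_digits):
    """Largest whole-unit amount when the denominator is 10**fraction_digits."""
    return INT64_MAX // 10**fraction_digits


def fits_int64(value):
    """True if value is representable as a signed 64-bit integer."""
    return -2**63 <= value <= INT64_MAX


# Headroom left for whole units at max fractions of 6, 8 and 10 digits.
for digits in (6, 8, 10):
    print(digits, max_whole_units(digits))

# Assumed example: 1000 BTC with an 8-digit fraction, converted at the
# quoted rate of 7318412.03498 IDR per BTC (5 fractional digits).
btc_numerator = 1000 * 10**8        # denominator 10**8
rate_numerator = 731841203498       # denominator 10**5

# Naive intermediate product before any reduction (denominator 10**13).
product_numerator = btc_numerator * rate_numerator
print(fits_int64(product_numerator))
```

In this assumed case the reduced result (about 7.3E9 IDR) would fit comfortably; it is the intermediate numerator, around 7.3E22, that is too large for 64 bits.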

I don’t see rounding as an issue with rational numbers. Rounding is used only for display, and the internal representation is always exact. This sometimes trips me up when I’m doing taxes: my balance sheet will show a $.01 discrepancy between Assets and Liabilities + Equity because I have to take the displayed values and transfer them to my tax software, which then adds the rounded values instead of the exact ones. That would change in the face of a decimal internal representation, whether it’s binary or BCD. We’d need to allow for spare digits so that the rounding could be made insignificant at the maximum fraction level, hence my suggestion for a 12-decimal fixed-point on two uint64s.
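
The $.01 effect is easy to reproduce outside GnuCash. The following is a small Python sketch using the standard fractions and decimal modules; the three balances of 10/3 are made-up values, to_cents is a hypothetical helper, and half-even rounding is an assumption rather than a statement about what GnuCash does for display.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from fractions import Fraction


def to_cents(value):
    """Round an exact Fraction to two decimal places, as for display."""
    exact = Decimal(value.numerator) / Decimal(value.denominator)
    return exact.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)


# Made-up exact balances, each 10/3 (3.3333...).
balances = [Fraction(10, 3)] * 3

# What the tax software sees: each displayed value, then added up.
rounded_then_summed = sum(to_cents(b) for b in balances)

# What the exact rational arithmetic gives: add first, round once.
summed_then_rounded = to_cents(sum(balances))

print(rounded_then_summed, summed_then_rounded)
print(summed_then_rounded - rounded_then_summed)
```

Adding the displayed values loses a cent relative to rounding the exact total once, which is the same kind of discrepancy as the one on the balance sheet.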

Regards,
John Ralls